


UNIPLUS+ SYSTEM V 
Administrator Guide 




SYSTEMS



PN: 1180-01



Copyright © 1984 UniSoft Corporation.

Portions of this material have been previously copyrighted by: 

Bell Telephone Laboratories, Incorporated, 1980 

Western Electric Company, Incorporated, 1983 

Regents of the University of California 



Holders of a UNIX and UniPlus+ software license are permitted to copy this
document, or any portion of it, as necessary for licensed use of the software,
provided this copyright notice and statement of permission are included.



UNIX is a Trademark of AT&T Bell Laboratories, Inc. 
UniPlus+ is a Trademark of UniSoft Corporation of Berkeley.



PREFACE 



This guide is a reference for those who administer and operate the
UniPlus+ system. It contains a description of console operations and
general instructions for normal operator and administrator functions as
they apply to the family of MC68000 processors running the UniPlus+
System V operating system. This guide should be used to supplement
the information contained in the UniPlus+ System V User's Manual and
the UniPlus+ System V Administrator's Manual.

This guide contains 11 chapters:

• INTRODUCTION 

• ADMINISTRATIVE ADVICE 

• MC68000/MC68010 OPERATIONS 

• START-UP PROCEDURES 

• SINGLE USER AND MULTIUSER MODE 

• DUTIES 

• SYSTEM ACCOUNTING 

• FSCK: FILE SYSTEM CHECKING 

• LP SPOOLING SYSTEM 

• SYSTEM ACTIVITY PACKAGE 

• UUCP ADMINISTRATION 

Chapter 1, INTRODUCTION, gives an overview of the system operator 
and administrator responsibilities. 

Chapter 2, ADMINISTRATIVE ADVICE, contains helpful advice and
suggestions for system administrators of UniPlus+.

Chapter 3, MC68000/MC68010 OPERATIONS, explains some basic 
operations of MC68000/MC68010 computers. 




Chapter 4, START-UP PROCEDURES, explains how to start up your 
UniPlus+ system. 

Chapter 5, SINGLE USER AND MULTIUSER MODE, describes the
two modes of operation of the UniPlus+ operating system and the
commands necessary to set the mode.

Chapter 6, DUTIES, gives specific examples of duties performed by 
either a computer operator or a system administrator. 

Chapter 7, SYSTEM ACCOUNTING, describes the structure, imple- 
mentation, and management of the accounting system. 

Chapter 8, FSCK: FILE SYSTEM CHECKING, describes the file system
check program (fsck) of the UniPlus+ system. Fsck audits and
interactively repairs inconsistencies in the file system.

Chapter 9, LP SPOOLING SYSTEM, defines the lp program and
describes the role of the LP administrator in performing restricted
functions and overseeing the smooth operation of lp.

Chapter 10, SYSTEM ACTIVITY PACKAGE, describes the design and
implementation of the UniPlus+ system activity package. The package
reports UniPlus+ system-wide statistics.

Chapter 11, UUCP ADMINISTRATION, describes how a uucp net- 
work is set up, the format of the control files, and administrative pro- 
cedures. 

Throughout this guide, each reference of the form name(1M),
name(7), or name(8) refers to entries in the UniPlus+ System V
Administrator's Manual. All other references of the form name(N),
where N is a number, possibly followed by a letter, refer to entries in
section N of the UniPlus+ System V User's Manual.






CONTENTS 



Chapter 1 INTRODUCTION 

Chapter 2 ADMINISTRATIVE ADVICE 

Chapter 3 MC68000/MC68010 OPERATIONS 

Chapter 4 START-UP PROCEDURES 

Chapter 5 SINGLE USER AND MULTIUSER MODE 

Chapter 6 DUTIES 

Chapter 7 SYSTEM ACCOUNTING 

Chapter 8 FSCK: FILE SYSTEM CHECKING 

Chapter 9 LP SPOOLING SYSTEM 

Chapter 10 SYSTEM ACTIVITY PACKAGE 

Chapter 11 UUCP ADMINISTRATION 



Chapter 1: INTRODUCTION 

CONTENTS 

1. General

2. System Console

3. Input/Output Notations

4. Local Needs



Chapter 1 
INTRODUCTION 



1. General 



In this guide, procedures and examples are given for starting up your
system (booting and powering), changing run levels (that is, single user
and multiuser), saving and restoring files, bringing down the system in
an orderly manner, and restoring the system after a crash. You should
always consult documentation for your processor before performing any
of the procedures in this guide.

2. System Console 

Most of the operations you do will involve the system console. All 
messages to the operator and input from the operator are via the sys- 
tem console. You will be using the system console in one of three 
modes: 

• Monitor/Boot: The UniPlus+ operating system is halted. In
this optional mode, a monitor or standalone operating system
may be available to operate the processor and load in the boot
program, or the boot program may already be running. See the
software and hardware reference manuals for your computer for
initial procedures and monitor commands.

• Single user: The UniPlus+ operating system is executing. The
commands you enter on the system console are UniPlus+ system
commands. In single-user mode you are always super-user.
When the system is halted or in single-user mode, the console is
the only interface to the system, unless you specifically change the
configuration so that another terminal acts as a console.

• Multiuser: The UniPlus+ operating system is executing. The
system console (and any other configured terminal) is treated as a
normal user terminal.

In halt mode or single-user mode, the console is not treated as a
login terminal (therefore, you are super-user). When you change the
system to multiuser mode, a login message appears on the console.
You must provide a login and password at this point in order to use the
console. Normally you should log in as root, but the login you use is a
local decision. In fact, the system administrator may configure your
system so that it is not even necessary for you to log in after changing
to multiuser.

This guide describes normal daily maintenance requirements and provides
examples of normal operations (not including local procedures). For
more information on the console (for example, set-up procedures),
consult your console terminal owner's manual.

3. Input/Output Notations 

Throughout this guide, the following notation is used for computer 
input/output: 

1. Special characters are in all caps (for example, when you see
CONTROL, read this as the "control" or "CTRL" keyboard character,
and RETURN as the "carriage return" key).

2. Items within [ ]s are optional.

3. You should type in literally any indented command field that 
appears boldface (a keyword). 

4. You should substitute with the appropriate information any com- 
mand field that appears in italics. 

5. All commands (system or console commands) should be ter- 
minated with a carriage return. 

4. Local Needs 

Because this guide is intended to be as general as possible, no 
machine-specific or installation-specific information has been included. 
Also, some operations may vary according to local procedures. It is 
suggested that you add specific information about: 

• Hardware configuration 

• Software configuration of administrative files 

• Data set configuration

• Specific logging and record-keeping practices 

• Contacts for hardware and software problems 

• Site-dependent diagnostic procedures. 






Chapter 2: ADMINISTRATIVE ADVICE 

CONTENTS 

1. Introduction

2. Administrator's Road Map

3. A Few Words About System Tuning

4. File System Backup Programs

5. Controlling Disk Usage

6. Reorganizing File Systems

7. Keeping Directory Files Small

8. Administrative Use of "cron"

9. Watch Out for Files and Directories that Grow

10. Allocating Resources to Users

11. The Matter of Accounting and Usage

12. Dial-Line Utilization

13. "Bird-Dogging"

14. Terminals

15. Line Printers

16. Security

17. Communicating with the Users

18. Null Modem Wiring

LIST OF FIGURES

Figure 2.1. File System Backup Programs



Chapter 2 
ADMINISTRATIVE ADVICE 



1. Introduction 

The information contained in this chapter applies to
MC68000/MC68010 processors.

2. Administrator's Road Map 

This chapter contains administrative advice based on the experience and 
suggestions of many system administrators. Other reasonable 
approaches may be taken to solve many of the problem areas described. 



Getting started as a UNIX system administrator is hard work. There are
no real shortcuts to a working knowledge of the system. The system
administrator will need time for reading, studying, and hands-on
experimenting. The system administrator should not go "live" with the
system until he or she has had several weeks to learn the job and get the
initial hardware quirks ironed out.

The administrator should be familiar with most of the distributed
documentation. All of the sections of the UniPlus+ System V Administrator's
Manual should be studied.

Pay special attention to the following in the UniPlus+ System V
Administrator's Manual and UniPlus+ System V User's Manual:






     chmod(1)              mail(1)
     chown(1)              mkdir(1)
     cpio(1)               ps(1)
     date(1)               rm(1)
     du(1)                 rmdir(1)
     ed(1)                 su(1)
     env(1)                time(1)
     find(1)               who(1)
     kill(1)               write(1)

     acct(1M)              mkfs(1M)
     checkall(1M)          ncheck(1M)
     dcopy(1M)             shutdown(1M)
     df(1M)                sync(1M)
     errpt(1M)             volcopy(1M)
     fsck(1M)              wall(1M)
     fuser(1M)

     acct(4)
     all of section 7
     crash(8)



3. A Few Words About System Tuning 

A file system reorganization can help throughput but at the expense of 
down time. If the reorganization is done during nonprime time, it can 
help. 



If normal shutdown and filesave procedures are used, the file system
check program [fsck(1M), -S option] will help keep the disk free list
in reasonable order. Try to keep disk drive usage balanced. If there
are over 20 users, the root file system (/bin, /tmp, and /etc) deserves
a drive of its own. If there is a noisy modem (poorly executed
do-it-yourself null-modem) or a disconnected modem cable, the UniPlus+
system will spend a lot of CPU time trying to get it logged in. A random
check of systems uncovers a lot of this going on.

4. File System Backup Programs 

The following backup programs are distributed: 




• Find/cpio: The UniPlus+ system is distributed in cpio format.
The -cpio option of the find command can be used for saving
only those files that have changed or been created over a definite
period.

• Volcopy: Physical file system copying to disk or tape. For those 
with a spare drive, volcopy to disk provides convenient file restore 
and quick recovery from disk disasters. Tape volcopy provides 
good long-term backup because the file system can be read-in 
fairly quickly, mounted, and browsed over. Disk and tape volcopy 
are generally used together for short- and long-term backup. 
Note that a volcopy from a mounted file system may result in an 
inconsistent copy (files being written at the time can contain 
invalid data). 

Figure 2.1 summarizes attributes of these programs. In the figure, the 
file system size is 65,500 KB in all cases; times are in minutes; judge- 
ments are subjective. 





                                 FIND/CPIO   VOLCOPY (DISK)   VOLCOPY (TAPE)

Full dump time                   40          2                15
Incremental dump time            7           -                -
Full restore time                80          2                15
Incremental restore time         10          -                -
Ease of restoring:
    one file                     fair        good             fair
    a directory                  fair        good             good
    scattered files              poor        good             good
    full restore                 fair        very good        good
Needs tape drive                 yes         no               yes
Needs spare file system
  (two CPUs can share)                       yes              -
Maintains pack/tape labels       no          yes              -
Handles multireel tape           yes         -                yes
512 KB per record                1.10        88               10
Interactive
  (i.e., ties up console)        yes         yes              yes
May require separate
  I/D space                      no          no*              no



* KB per record are cut to 22 without separate I/D space. 

Figure 2.1. File System Backup Programs 

The spare disk drive is strongly recommended. The speed and conveni- 
ence of volcopy are by no means the only advantage of a spare drive. 




It is strongly recommended that the administrator modify the 
/etc/filesave and /etc/checklist files to meet the operational needs and 
update the local operator's manual accordingly. Remember, the more 
the administrator automates and documents operational procedures, the 
less downtime will be encountered. 

5. Controlling Disk Usage 

Once the UniPlus+ system is a success, disk space will soon become
limited. During the long delay before more drives become available,
usage should be controlled. Try to maintain the start-of-day counts
recommended. Watch usage during the day by executing the df(1)
command regularly.

The du(1) command should be executed (after hours) regularly (e.g.,
daily), and the output kept in an accessible file for later comparison. In
this way, users rapidly increasing their disk usage may be spotted. This
can also be accomplished by running the accounting system's acctdusg
program.
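
The daily comparison described above can be sketched as a small shell
function. This is only an illustration: the function name du_growth, the
snapshot file names in the usage note, and the assumption that each snapshot
holds du -s output (a block count followed by a directory name that contains
no white space) are not part of the distributed system.

```shell
# du_growth OLD NEW [MIN_BLOCKS]
# Compare two saved "du -s" snapshots (lines of "<blocks> <directory>")
# and print every directory whose usage grew by at least MIN_BLOCKS
# blocks (default 1). Hypothetical helper; assumes OLD is not empty and
# that directory names contain no white space.
du_growth() {
    old=$1
    new=$2
    min=${3:-1}
    awk -v min="$min" '
        # First file: remember the earlier block count of each directory.
        NR == FNR { before[$2] = $1; next }
        # Second file: report known directories that grew enough.
        ($2 in before) && ($1 - before[$2] >= min) {
            printf "%s %d -> %d\n", $2, before[$2], $1
        }
    ' "$old" "$new"
}
```

For example, if two snapshots had been saved as /usr/adm/du.old and
/usr/adm/du.new (hypothetical names), the command
du_growth /usr/adm/du.old /usr/adm/du.new 1000 would list the directories
that grew by 1000 blocks or more between the two runs.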

The find(1) command can be used to locate inactive (or large) files.
For example:

     find / -mtime +90 -atime +90 -print >somefile

records in "somefile" the names of files neither written nor accessed in
the last 90 days.

The administrator will also have to balance usage between file systems.
To do this, user directories must be moved. Users should be taught to
accept file system name changes (and to program around them, preferably
ahead of time). The user's login directory name (available in the
shell variable HOME) should be utilized to minimize pathname
dependencies. User groups with more extensive file system structures should
set up a shell variable to refer to the file system name (e.g., FS).

The find(1) and cpio(1) commands can be used to move user directories
and to manipulate the file system tree. The following sequence is
useful (it moves the directory trees userx and usery from file system
filesys1 to file system filesys2 where, presumably, more space is
available):






     cd /filesys1
     find userx usery -print | cpio -pdm /filesys2
     # Make sure new copy is OK.
     # Change userx and usery login directories
     # in the /etc/passwd file.
     # Notify userx and usery via mail(1) that
     # they have been moved and that pathname
     # dependencies in their .profile and shell
     # procedures may need to be changed. See the
     # discussion on $HOME above.
     rm -rf /filesys1/userx /filesys1/usery

When moving more than one user in this way, keep users with common
interests in the same file system (these users may have linked
files) and move groups of users who may have linked files with a single
cpio command (otherwise linked files will be unlinked and duplicated).
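
Before deciding which users to move together, it may help to see which files
have more than one link. The following is a minimal sketch, not a distributed
program; the function name linked_files is an assumption, and it relies only
on the standard -links primary of find.

```shell
# linked_files DIR...
# Print every regular file under the given directory trees that has more
# than one hard link. Trees that share such files are candidates for
# being moved by a single cpio command. Hypothetical helper name.
linked_files() {
    find "$@" -type f -links +1 -print
}
```

Run from the top of the file system (for example, linked_files userx usery),
this lists candidates only; it does not show which names refer to the same
file.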

6. Reorganizing File Systems 

There is a new file system reorganization utility called dcopy(1M). On
an otherwise idle system, a reorganized file system has almost twice the
I/O throughput of a randomly organized file system. This applies to file
copying, finds, fscks, etc. Dcopy can take up to 2.5 hours to initially
reorganize (copy) a large file system. During reorganization, the
system can be up, but the file system being copied must be unmounted.

For those who can afford the operator time, root reorganization once a 
week (requires system reboot) and user file system reorganization once 
a month will improve system performance. Dcopy is an interim step. 

7. Keeping Directory Files Small 

Directories larger than 5K bytes (320 entries) are very inefficient 
because of file system indirection. A UNIX system user once com- 
plained that it took the system 10 minutes to complete the login pro- 
cess; it turned out that his login directory was 25K bytes long, and the 
login program spent that time fruitlessly looking for a nonexistent 
".profile" file. A large /usr/mail or /usr/spool/uucp directory can 
also really slow the system down. The following will ferret out such 
directories: 

     find / -type d -size +10 -print
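
As a complement to the size test above, directories can also be ranked by
counting their entries directly. This is a sketch under stated assumptions:
the function name big_dirs is hypothetical, the default threshold of 320 is
simply the entry count quoted in the text, and file names containing newline
characters would be miscounted.

```shell
# big_dirs TOP [MAX_ENTRIES]
# Print "<count> <directory>" for every directory under TOP that holds
# more than MAX_ENTRIES entries (default 320). Hypothetical helper.
big_dirs() {
    top=$1
    max=${2:-320}
    find "$top" -type d -print | while IFS= read -r dir; do
        # ls -A lists every entry except "." and "..", one per line when
        # its output is a pipe; $(( )) strips any padding added by wc.
        count=$(( $(ls -A "$dir" | wc -l) ))
        if [ "$count" -gt "$max" ]; then
            echo "$count $dir"
        fi
    done
}
```

Piping the output through sort -rn puts the largest directories first.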




Removing files from directories does not make the directories get 
smaller (the empty directory entries are available for reuse). The fol- 
lowing will "compact" /usr/mail (or any other directory): 

     mv /usr/mail /usr/omail
     mkdir /usr/mail
     chmod 777 /usr/mail
     cd /usr/omail
     find . -print | cpio -plm ../mail
     cd ..
     rm -rf omail

8. Administrative Use of "cron" 

The program cron(1M) is useful in the administration of the system; it
can be used to:

• Turn off the programs in directory /usr/games during prime time. 

• Run programs off-hours: 

- accounting; 

- file system administration; 

- long-running, user-written shell procedures. 

9. Watch Out for Files and Directories that Grow 

Most of the files below are restarted automatically by entries in /etc/rc
at system reboot.

• Accounting files: 

- /etc/wtmp: login information; grows extremely fast with
terminal line difficulties; use acctcon(1M) to determine the
offending line(s).

- /usr/adm/pacct: per-process accounting records; gets big
quickly; monitored automatically by ckpacct from cron(1M).

- /usr/lib/cron/log: status log of commands executed by
cron(1M); also watch this file for error messages from the
programs being executed in /usr/spool/cron/crontab/*.

- /usr/adm/errfile: hardware error logging information; also read
login adm's mail periodically.

- /usr/adm/ctlog: a log of the people who use the ct(1C) command.

- /usr/adm/sulog: a log of those who execute the superuser
command, su(1).

- /usr/adm/Spacct: process accounting files left over from
an accounting failure; remove these files unless the failed
accounting is to be rerun.

• Other files:

- /usr/spool: spooling directory for line printers, uucp(1C),
etc., whose subdirectories should be compacted as
described above.
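
One way to keep an eye on the files listed above is to record their sizes at
regular intervals. The following is a minimal sketch; the function name
report_sizes is an assumption, and which files to pass to it is a local
decision.

```shell
# report_sizes FILE...
# Print "<bytes> <file>" for each named regular file that exists, in the
# order given. Hypothetical helper for watching files that grow.
report_sizes() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        # $(( )) strips any padding that wc adds around the count.
        size=$(( $(wc -c < "$f") ))
        echo "$size $f"
    done
}
```

For example, report_sizes /etc/wtmp /usr/adm/pacct /usr/adm/sulog | sort -rn
lists the largest of those files first.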

10. Allocating Resources to Users 

A prospective user should first obtain authorization to use the system 
and then apply for a login by providing the following information to the 
System Administrator: 

• User's name.

• Suggested login name (not more than eight characters, beginning
with a lowercase letter and not containing special or uppercase
letters).

• Relationships to other users (this influences the choice of the file
system).

• Estimate of required file space (this also influences the choice of
the file system) and connect hours. This aids in hardware growth
planning.



Users must have passwords with at least six characters. (Only the first
eight characters are significant.) Also, every password must have at
least two alphabetic characters and one numeric or special character.
The password must differ from the user's login name and any reverse
or circular shift of it. Refer to passwd(1) and passwd(4) for more
information on password selection and password aging.
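
The rules above can be expressed as a short shell function. This is a sketch
of the rules as stated in this section only, not the actual logic of
passwd(1), which may apply further checks; the function name password_ok is
an assumption, and the sketch treats every non-alphabetic character as
"numeric or special".

```shell
# password_ok LOGIN PASSWORD
# Return 0 if PASSWORD meets the rules stated in the text, 1 otherwise.
# Hypothetical helper; not the implementation used by passwd(1).
password_ok() {
    login=$1
    pw=$2

    # Rule 1: at least six characters.
    [ "${#pw}" -ge 6 ] || return 1

    # Rule 2: at least two alphabetic characters.
    alpha=$(printf '%s' "$pw" | tr -cd 'A-Za-z')
    [ "${#alpha}" -ge 2 ] || return 1

    # Rule 3: at least one numeric or special (non-alphabetic) character.
    [ "${#alpha}" -lt "${#pw}" ] || return 1

    # Rule 4a: must differ from the reverse of the login name.
    rev=
    rest=$login
    while [ -n "$rest" ]; do
        first=${rest%"${rest#?}"}      # first character of $rest
        rev=$first$rev
        rest=${rest#?}
    done
    [ "$pw" != "$rev" ] || return 1

    # Rule 4b: must differ from the login name and each circular shift.
    shifted=$login
    n=${#login}
    while [ "$n" -gt 0 ]; do
        [ "$pw" != "$shifted" ] || return 1
        first=${shifted%"${shifted#?}"}
        shifted=${shifted#?}$first     # rotate left by one character
        n=$((n - 1))
    done
    return 0
}
```

With the login name smith1, for example, the password mith1s fails because it
is a circular shift of the login name, while ab12cd passes all four rules.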

11. The Matter of Accounting and Usage 

You should run the accounting programs even if there is not a "bill" 
for service. Otherwise, users' habits (especially bad habits) will be a 
mystery to you. Accounting information can also help you find perfor- 
mance bottlenecks, unused logins, bad phone lines, etc. 






12. Dial-Line Utilization 

If prime-time dial-line utilization gets much over 70 percent, users will
start to encounter busy signals when dialing in. This, in turn, will lead
to "line hogging". The only solutions are to acquire more dial-up
ports, get a larger (or another) machine, or get rid of users. Manual
policing will help some, but "automatic" policing will invariably be
subverted by users.

13. "Bird-Dogging" 

When the system is busy (lines busy and/or slow response), someone
should determine why this is so. The who(1) command lists the people
logged in. The ps(1) command shows what they are doing. Unfortunately,
ps operates from heuristics that can consistently fail to report
certain processes in a busy system, so be careful about
hanging up an apparently inactive line. The acctcom(1M) command
can read the process accounting file /usr/adm/pacct backwards from
the most recent entry. It will print entries for selected lines or login
names.

14. Terminals 

Do not use uppercase-only terminals. Use full-duplex, full-ASCII
asynchronous terminals. Hardware horizontal tabbing is very desirable
because it increases output speed and lowers system overhead. A fair
proportion of the terminals should provide for correspondence-quality
hard copy output to take advantage of the UniPlus+ system word
processing capabilities; see term(5).

15. Line Printers 

Most line printers are troublesome and impose considerable overhead 
on the system. Most also lack hardware tabs, character overstrike capa- 
bility, etc. A printer that will work over an asynchronous link 
(DC1/DC3 protocol required) is the best bet. 

16. Security 

The current UNIX operating system is not tamperproof. The system 
administrator cannot keep people from "breaking" the system but can 
usually detect that they have done so. The following command will 
mail (to root) a list of all "set user ID" programs owned by root 
(superuser): 




     find / -user root -perm -4100 -exec ls -l {} \; | mail root

Any surprises in root's mail should be investigated. In dealing with 
security, 

• Change the superuser password regularly. Do not pick obvious
passwords (choose 6-to-8 character nonsense strings that combine
alphabetics with digits or special characters).

• Dial ports that do not require passwords usually cause trouble.

• The chroot(1M) and su(1) commands are inherently dangerous, as
are group passwords.

• Login directories, ".profile" files, and files in /bin, /usr/bin,
/lbin, and /etc that are writable by others than their respective
owners are security weak spots; police the system regularly against
them.

• Remember, no time-sharing system with dial ports is really
secure. Do not keep top secret information on the system.
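
The policing of writable files mentioned in the list above can be sketched
with find. The function name writable_by_others is an assumption; the sketch
reports anything with group or other write permission and skips symbolic
links, whose own mode bits are not meaningful on systems that have them.

```shell
# writable_by_others DIR...
# Print every file or directory under each DIR that is writable by group
# or by others. Arguments that are not directories are skipped.
# Hypothetical helper name.
writable_by_others() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" ! -type l \( -perm -020 -o -perm -002 \) -print
    done
}
```

For example, writable_by_others /bin /usr/bin /lbin /etc | mail root would
send the list to root in the same way as the "set user ID" check above.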

17. Communicating with the Users 

The directory /usr/news and the news(l) command are provided as a 
way to get "brief" announcements to your users. More pressing items 
(one-liners) can be entered in the /etc/motd (message of the day) file; 
motd and (new to the user) news are announced at login time. 

To reach users who are already logged in, use the wall(lM) (write all) 
command. Do not use wall while logged-in as superuser, except in 
emergencies. 

The /usr/news directory should be cleaned out once a week by remov- 
ing everything older than 2 months. It has been found that on most 
systems a file in /usr/news will reach 50 percent of the users within a 
day and over 80 percent within a week; motd should be cleaned out 
daily. 
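
The weekly cleanup of /usr/news could be sketched as follows. The function
name purge_old_news and the choice of 60 days as an approximation of
"2 months" are assumptions; try it with -print in place of the rm command
before relying on it.

```shell
# purge_old_news [DIR] [DAYS]
# Remove regular files in DIR (default /usr/news) that were last modified
# more than DAYS days ago (default 60, roughly two months).
# Hypothetical helper name.
purge_old_news() {
    dir=${1:-/usr/news}
    days=${2:-60}
    find "$dir" -type f -mtime +"$days" -exec rm -f {} \;
}
```

A function like this is a natural candidate for a weekly cron(1M) entry.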

18. Null Modem Wiring 

Improperly wired null modems can cause spurious interrupts, especially 
at higher baud rates. A single bad modem on a 9600-baud line can 
waste 15 percent of your CPU power. The following (symmetrical)
wiring plan will prevent such problems:

pin 1 to 1 

pin 2 to 3 

pin 3 to 2 

strap pin 4 to 5 in the same plug 

pin 6 to 20 

pin 7 to 7 

pin 8 to 20 

pin 20 to 6 and 8 

ground unused pins 






Chapter 3: MC68000/MC68010 OPERATIONS 

CONTENTS 

1. Introduction

2. Booting

3. Shutting Down

4. Powering Down



Chapter 3 
MC68000/MC68010 OPERATIONS 



1. Introduction 

Information on system operations should be obtained from the 
manufacturer of your box. Console commands and start-up procedures 
vary, depending on hardware configurations. 

2. Booting 

In general, a boot program is used to start up UniPlus+. This boot
program can reside in PROM, on a floppy, or at the beginning of a hard
disk. The boot program must first find out where UniPlus+ resides,
either by looking at a specific place on the disk or by prompting the user
for this information. Once UniPlus+ is located on the file system, the
boot program will load it from disk to memory. For specific booting
instructions, refer to the manual from the manufacturer of your box.

Once loaded, the UniPlus+ operating system is ready to come up. The
system will scan the /etc/inittab file to determine, among other things,
which run level will be entered. If this file specifies a run level (or a
default level is found), the system will enter the run level specified.
Otherwise, do the following steps:

1. This message should appear on the console:

ENTER RUN LEVEL (0-6, s or S): 

Enter 2<cr> to go to multiuser state, or s<cr> to go to single 
user state. 

2. If you requested multiuser in step 1, the system will ask you to 
verify the date. Then you will be asked if the file systems are to 
be checked. Finally, the following message will be printed on the 
console: 

Console Login: 

If you requested single user in step 1, the # prompt will be 
printed. In this case, typing telinit 2 will change the operating 
system state to multiuser. 




3. Shutting Down 

The shutdown procedure is designed to gracefully turn off all processes 
and bring the system back to single user state with all buffers flushed. 
To do this you should execute shutdown as described in Chapter 6. If 
shutdown is not successful, use the following sequence of commands: 

     killall
     sync
     init S
     fsck          (this step is optional)

4. Powering Down 

The shutdown sequence should always be run before powering down. 
Disk drives, where they require separate powering, should be powered 
down before powering down the processor. Refer to instructions from 
the manufacturer for any other specific procedures. 






Chapter 4 
START-UP PROCEDURES 



Below is a description of how to start up your UniPlus+ system. A
variety of procedures may be necessary to start the system. The
processor and peripherals (such as disk drives) may need to be powered
up. Additionally, a combination of hardware and software resets and
monitor commands may be required. The final step in starting up the
system is generally the boot. The boot procedure loads a copy of the
UniPlus+ operating system from disk, floppy, tape, or some other
medium into memory and executes it.

You will need to reboot the UniPlus+ operating system when one of
the following conditions occurs:

• system crash or restart; 

• loading of a new software release; or 

• updating of the software release. 

Once loaded, the UniPlus+ operating system will typically enter the
single-user "run level" awaiting your commands. When properly
configured by the system administrator, the UniPlus+ operating system
uses init to automatically enter the final run level. Run levels are
discussed in the "Single User and Multiuser Mode" chapter of this guide.
Normally, run level s indicates single user and 2 indicates multiuser.
For more information on init refer to init(1M) in the UniPlus+ System
V Administrator's Manual, inittab(4) in the UniPlus+ System V User's
Manual, or, if you are an operator, consult the local system
administrator.

See the relevant software or hardware reference manual for your com- 
puter for detailed powering and booting procedures. 






Chapter 5: SINGLE USER AND MULTIUSER MODE 



CONTENTS 



1. Introduction

2. Single-User Environment

2.1 The fsck Command

2.2 The telinit 2 Command

3. Multiuser Environment



Chapter 5 
SINGLE USER AND MULTIUSER MODE 

1. Introduction 

There are two main modes of operation of the UniPlus+ operating
system: single user (level S) and multiuser (level 2). The run level has
eight possible values: 0-6 and S (or s). Single user is always S or s.
Although multiuser is normally level 2, the system administrator can
configure the /etc/inittab file to run multiuser at any level from 0 to 6.

The /etc/inittab file can also be configured so that certain procedures are
followed automatically only the first time that a certain run level is
entered. For example, normally you will be asked to verify date and
file systems the first time you change your system to multiuser. This is
caused by an entry in the inittab file. Subsequent changes in run level
will not perform this procedure automatically unless you specifically
change the inittab file. For more information on init refer to init(1M)
in the UniPlus+ System V Administrator's Manual, inittab(4) in the
UniPlus+ System V User's Manual, or, if you are an operator, consult
your local system administrator.

When in single-user mode, all dial-up ports and hard-wired terminals
are disabled and only the console terminal may interact with the
processor. This mode of operation allows you to make necessary changes to
the system without any other processing taking place. However, you
will normally run the UniPlus+ operating system in multiuser mode.
Consult the documentation for your particular processor before
proceeding with any of these procedures.

2. Single-User Environment 

In single-user mode, you may type any available system command (fol- 
lowed by a return). When the system has completed execution of 
the command, it will prompt with the "#" again on the next line. You 
use the single-user environment primarily to do filesaves, system 
maintenance, modification, or repair operations. The typical sequence 
of commands to change the system to multiuser mode is: 

1. fsck

2. telinit 2

2.1 The fsck Command 

The command fsck will interactively repair any damaged file systems
that result from a crash of the operating system. You should also use it
to ensure that the file systems are not damaged before going into
multiuser mode or taking filesaves. Usually, you will want to respond
"yes" to all the prompts; however, in the event of a system crash, the
damage may be extensive enough to warrant recovery from a backup
pack. The procedure for this is discussed under "Filesaves" in Chapter
6. See fsck in the UniPlus+ System V Administrator's Manual for details
on the various options available and Chapter 8 in this guide for a
description of all the different errors that can occur.

An example of a check of a consistent file system is illustrated below: 

     # fsck /dev/rsmd1
     /dev/rsmd1
     File System: usr Volume: p0603
     ** Phase 1 - Check Blocks and Sizes
     ** Phase 2 - Check Pathnames
     ** Phase 3 - Check Connectivity
     ** Phase 4 - Check Reference Counts
     ** Phase 5 - Check Free List
     2441 files 16547 blocks 31889 free
     #

A file system that has been damaged can be repaired as shown below.
The y is your response. When checking a file system, you can avoid
the questions asked by fsck concerning inconsistencies found by using
the -y option. This option will automatically attempt repairs as though
you answered "yes" to the questions. Use this with caution: the
corrections usually involve some data loss. If you decide to
interactively repair the file system, then follow the example below:

     # fsck /dev/rsmd2

The UniPlus+ operating system responds:

     /dev/rsmd2
     File System: fs1 Volume: p0603
     ** Phase 1 - Check Blocks and Sizes
     POSSIBLE FILE SIZE ERROR I = 2500
     ** Phase 2 - Check Pathnames
     ** Phase 3 - Check Connectivity
     ** Phase 4 - Check Reference Counts
     UNREF FILE I = 2500 OWNER = 255 MODE = 100755
     SIZE = MTIME = Dec 31 19:30 1983
     CLEAR? y
     ** Phase 5 - Check Free List
     2441 files 16547 blocks 889 free
     ***** FILE SYSTEM WAS MODIFIED *****
     #

All mountable file systems should be listed in the file /etc/checklist 
which fsck uses, and you should check these file systems each time the 
system is rebooted. 

A faster alternative to using fsck is checkall. The checkall command
uses dfsck (a front end for fsck) to check two file systems on
different disk drives simultaneously. Included in checkall are the
file system names that normally appear in /etc/checklist (see checkall
in the UniPlus+ System V User's Manual).

WARNING: Never execute fsck on a mounted file system. fsck repairs
only the copy on the physical disk, so the system's in-memory copy of
a mounted file system can later be written back over the repairs. The
only exception to this is the root file system, which is always mounted.

An example of repairing the root file system follows: 




# fsck /dev/smd0
/dev/smd0
File System: root Volume: p0001
** Phase 1 - Check Blocks and Sizes
POSSIBLE FILE SIZE ERROR I=416
POSSIBLE FILE SIZE ERROR I=610
POSSIBLE FILE SIZE ERROR I=614
POSSIBLE FILE SIZE ERROR I=618
POSSIBLE FILE SIZE ERROR I=625
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=416 OWNER=uucp MODE=100400
SIZE = MTIME = Nov 20 16:23 1983
CLEAR? y
UNREF FILE I=610 OWNER=csw MODE=100400
SIZE = MTIME = Nov 20 16:26 1983
CLEAR? y
UNREF FILE I=625 OWNER=cath MODE=100400
SIZE = MTIME = Nov 20 16:26 1983
CLEAR? y
FREE INODE COUNT WRONG IN SUPERBLK
FIX? y
** Phase 5 - Check Free List
1 DUP BLKS IN FREE LIST
BAD FREE LIST
SALVAGE? y
** Phase 6 - Salvage Free List
585 files 5463 blocks 4223 free
***** BOOT UNIX (NO SYNC!) *****

# 

At this point you must immediately halt the processor and reboot the
system without running sync (see the relevant software or hardware
reference manual for your computer for start-up procedures).

2.2 The telinit 2 Command 

After you have checked the file systems, you may change the UniPlus+
operating system to multiuser mode. Do this by entering the command
telinit 2. This command activates processes that allow users to log in
to the system, turn on accounting and error logging, mount any
indicated file systems, and start cron and any indicated daemons.
Depending upon the type of data set your site has, you may have to
manually flip the toggles or pop the buttons on the data sets to allow
users to log in.

3. Multiuser Environment 

There are two ways to get to this level: by typing telinit 2, or by
specifying a run level of 2 after the boot. Users are permitted to access all
mounted file systems and execute all available commands. In this 
mode, you can perform file restore procedures and take periodic status 
checks of the system. Some of these periodic status checks can include: 

• A check of free blocks (df) remaining on all mounted file systems 
to ensure that a file system does not run out of space. 

• A check on mail to root or whatever login receives requests for 
file restores. 

• A check on the number of users on the system (who). 

• A check of all running processes ("ps -eaf" or whodo) to determine
whether some process is using an abnormally large amount of CPU time.
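
The free-space check in the first item lends itself to a small script. The following is a minimal sketch and not part of the UniPlus+ distribution: the function name low_space, the default threshold of 1000 blocks, and the two-column input format (file system name, then free blocks) are all assumptions made for illustration. The output of df differs between systems, so it must first be reduced to that two-column form.

```shell
# low_space: read "name free_blocks" pairs on standard input and print
# the name of every file system whose free-block count is below the
# threshold given as $1 (hypothetical default: 1000 blocks).
low_space() {
    threshold=${1:-1000}
    while read name free
    do
        # skip blank or incomplete lines
        [ -n "$name" ] && [ -n "$free" ] || continue
        if [ "$free" -lt "$threshold" ]
        then
            echo "$name"
        fi
    done
}
```

For example, given the two input lines "/usr 31889" and "/tmp 420" and a threshold of 1000, the function prints only /tmp.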

If your site has other run levels defined, you can use the telinit
command to change to those run levels. Finally, to change a multiuser
system to single user, refer to "System Shutdown" in Chapter 6.






Chapter 6: DUTIES 



CONTENTS 



1. Introduction
2. Filesaves
   2.1 Saving the Root File System on Disk
   2.2 Saving the User File System on Disk
   2.3 Saving the User File System on Tape
3. File Restores
   3.1 Restoring from Disk
   3.2 Restoring from Tape
4. Message of the Day
5. System Shutdown
   5.1 Shutdown Program
6. System Crash Recovery



Chapter 6 
DUTIES 



1. Introduction 

This chapter is a guide for the normal duties of a computer operator or 
system administrator. These descriptions do not represent what specific 
job duties are; they merely outline the general procedures to ensure 
that the system operates properly. Consult instructions for your proces- 
sor before proceeding with any of these procedures. 

2. Filesaves 

Unless you make frequent copies of the file systems, a major system 
crash could devastate your user community. The user files could be 
destroyed or become inaccessible. 

You should take daily filesaves. Should the system crash and lose files, 
then, at most, only a day's work will be lost. If your last filesave (or 
backup) was a week ago, then even after restoring the file any changes 
made since that backup will be lost. 

There are two ways you can do filesaves: to disk and to tape. Most
sites use volcopy to save files. See volcopy in the UniPlus+ System V
Administrator's Manual for more information on the available options
and the use of this command. You should normally do your file saving
while in single-user mode, with the file system unmounted, to preclude
any file system activity and subsequent damage to the saved copy. Also,
to ensure that system buffers are flushed and file systems are up to
date, execute the sync command before filesaves.

Normally the filesave procedure is automated by the system
administrator. You or your administrator may have created a shell
script to perform the filesave as part of your site's local operation.
Daily filesaves usually are made on disk, whereas a weekly filesave is
more efficiently made on tape. Tape saves are necessary for long-term
storage, or for regular saves if you do not have a spare disk. Tapes
may be labeled in advance, or may be labeled by the volcopy command.
You or your administrator may have created separate shell scripts for
disk and tape saves (incorporating the procedures that follow).
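
As an illustration of what such a site script might look like, here is a minimal sketch that only prints the commands for one disk filesave instead of running them, so the sequence can be reviewed first. The function name filesave_cmds and its argument order are hypothetical; the sequence itself (sync, umount, fsck, volcopy) is the one used in the disk procedures in this chapter.

```shell
# filesave_cmds: print, without executing, the command sequence for
# saving one file system to a backup disk with volcopy.
#   $1 file system name      $2 block device (for umount)
#   $3 raw source device     $4 source volume label
#   $5 raw target device     $6 target volume label
filesave_cmds() {
    echo "sync"
    echo "umount $2"
    echo "fsck $3"
    echo "volcopy $1 $3 $4 $5 $6"
}
```

A call such as filesave_cmds usr /dev/w0b /dev/rw0b p0001 /dev/rw1b p0603 lists the four commands in order. The root file system cannot be unmounted, so a script covering root would omit the umount step for it.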




You must have at least two disks, one of them a spare, for the follow- 
ing procedures. For ease of mapping, file systems are normally saved 
in the same partitions on the backup disk as they exist on the working 
disk. This is imperative if you ever need to boot from a backup version 
of root. The root file system must reside on partition a of the disk. 

2.1 Saving the Root File System on Disk 

In this example, the root file system on disk 0 will be saved on disk 1.

1. Connect the disk that will contain the filesave as disk 1.

2. Enter the commands: 

# sync
# fsck /dev/w0a
# volcopy root /dev/rw0a S3B001 /dev/rw1a S3B002

to copy the root file system from disk 0 partition a to disk 1
partition a. The following messages should appear:

From: /dev/rw0a, to: /dev/rw1a? (DEL if wrong)
END: 23000 blocks.

#

If the from and to file systems are correct, wait for the prompt;
otherwise, press the DELETE key to abort the copy.

3. Repeat step 2 for each partition of the disk to be copied.

4. Disconnect and remove disk 1. 

In the above procedure, fsck in step 2 asks you to concur with any
repairs necessary before attempting them. If you respond no, no action
will be taken and fsck will continue. Also, volcopy verifies the label
information on the to and from file systems (for example, pack number,
file system name, date last modified). You will be asked to override
inconsistencies before the copy proceeds. For example:




# volcopy root /dev/rw0a p0001 /dev/rw1a p0105
arg.(p0105) doesn't agree with to vol.()
Type 'y' to override: y
warning! from fs(root) differs from to fs()
Type 'y' to override: y
From: /dev/rw0a, to: /dev/rw1a? (DEL if wrong)
END: 23000 blocks.

# 

Note: In this example, the to partition is unlabeled, as indicated by the
null volume and file system fields. For more information see volcopy
in the UniPlus+ System V Administrator's Manual.

2.2 Saving the User File System on Disk

In this next example, the usr file system, on partition b of disk 0,
will be saved on disk 1, volume p0603.

1. Connect the disk that will contain the filesave as disk 1.

2. Enter the commands: 

# sync
# umount /dev/w0b
# fsck /dev/rw0b
# volcopy usr /dev/rw0b p0001 /dev/rw1b p0603

to copy the usr file system from disk 0 partition b to disk 1
partition b. The following messages should appear:

From: /dev/rw0b, to: /dev/rw1b? (DEL if wrong)
END: 23000 blocks. 

# 

If the from and to file systems are correct, wait for the prompt; 
otherwise, press the DELETE key to abort the copy. 

3. Repeat step 2 for each partition of the disk to be copied.

4. Disconnect and remove the disk. 

2.3 Saving the User File System on Tape 

In this example, the usr file system is saved on tape volume t0001,
mounted on transport 0. The labelit command is used to label the tape
before the copy. You should place an external paper label on the
outside of the reel carrying the same information as is written in the
tape header label. The external label should also indicate the sequence
number of the tape if it is from a set (multi-reel volume) for the file
system. Note the use of the -n option to labelit. Unless this option
is used on an unlabeled tape, the program will scan the entire reel
looking for a label to change before it rewinds and labels the
beginning. This can be very time-consuming on 2400-foot reels.

You can store approximately 65,000 blocks of a file system on a 2400- 
foot tape using volcopy and recording at 1600 bpi. You may specify the 
size and type of tape in the volcopy command, or you can let the sys- 
tem prompt for the information as shown. In the example that follows, 
the file system requires two reels. Although this example uses only one 
drive, you can have both reels mounted on different drives. In that 
case, when the first has finished, you would simply enter the name of 
the second drive when asked. 
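
The reel count follows directly from the capacity figure above. As a worked sketch, the function below rounds the block count up to whole reels; the name reels_needed is hypothetical, the default of 65000 blocks per reel is the approximate figure quoted above for a 2400-foot tape at 1600 bpi, and the $(( )) arithmetic assumes a POSIX shell (an older shell would need expr instead).

```shell
# reels_needed: number of tape reels required for a file system of
# $1 blocks, given the capacity of one reel in blocks ($2, default 65000).
reels_needed() {
    blocks=$1
    per_reel=${2:-65000}
    # integer ceiling division: round any partial reel up to a whole one
    echo $(( (blocks + per_reel - 1) / per_reel ))
}
```

For the 90000-block file system in the example that follows, reels_needed 90000 prints 2, which agrees with the "You will need 2 reels" message from volcopy.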

1. Load the tape in transport 0, and label it: 

# labelit /dev/rmt0 usr t0001 -n
Skipping label check!
NEW fsname = usr, NEW volume = t0001 - DEL if wrong!!

# 

2. Enter the following commands: 




# sync
# umount /dev/w0b
# fsck -y /dev/rw0b
# volcopy usr /dev/rw0b p0001 /dev/rmt0 t0001
Enter size of reel in feet for <t0001>: 2400
Reel t0001, 2400 feet, 1600 BPI
You will need 2 reels.
(The same size and density is expected for all reels)
From: /dev/rw0b, to: /dev/rmt0? (DEL if wrong)
Writing REEL 1 of 2, VOL = t0001
Changing drives? (RETURN if no, /dev/rmt_ if yes): RETURN
Mount tape 2
Type volume-ID when ready: t0002
Cannot read header (This tape has not been labeled!)
Type y to override: y
Volume is <garbage>, not <t0002>.
Want to override? y
Writing REEL 2 of 2, VOL = t0002
END: 90000 blocks.

# 

3. File Restores 

3.1 Restoring from Disk 

When a request is made to restore a file from a backup disk, you 
should first locate that disk and determine on which partition the 
requested file system resides. Then at the console terminal, log in to 
the system as root and proceed as the example illustrates. Following is 
the procedure for restoring the file /usr/adm/acct/sum/tacct from a 
previous backup disk. For this example, disk 1 is the backup disk and
/usr is on partition b of the disk.

1. Connect the disk as disk 1. 

2. Enter the command: 

# mount /dev/w1b /bck -r

This will mount the backup file system as /bck read-only. The
following message should appear:

WARNING!! - mounting <usr> as </bck> 

3. Enter the command: 

# ls -l /bck/adm/acct/sum/tacct

This will verify the existence of the file and the identity of the 
owner. The following output will appear: 

-rw-rw-r-- 1 adm bin 1932 Aug 9 14:27 /bck/adm/acct/sum/tacct 

4. Enter the command: 

# cp /bck/adm/acct/sum/tacct /usr/adm/acct/sum/tacct 
to copy the file from the backup to the specified place. 

5. Enter the command: 

# chown adm /usr/adm/acct/sum/tacct 
to change the owner of the file. 

6. Enter the command: 

# umount /dev/w1b

This will unmount the backup file system. 

7. Disconnect and remove the backup disk. 

When you perform a file restore, it is usually good practice to mail a
message to the user who asked for the restore when you are finished.
To avoid confusion, your message should refer to the file using a full
pathname. The procedure for this is:

# mail user 

I have restored the file /usr/adm/acct/sum/tacct 

from Friday's backup. 

your initials 
# 

3.2 Restoring from Tape 

If the file does not exist on any of the backup disks or if your
installation does not perform disk filesaves, then you will have to
recover the file from a tape save. It is assumed that you do your tape
saves in the same manner as disk saves, that is, with volcopy.
Filesaves are discussed earlier in this chapter. To restore a file
from tape, you must place the whole file system on a spare partition
of the disk. The backup tape version can then be accessed in the same
way as a disk save. For this example, it is assumed that there are two
small file systems stored on a single tape and that the usr file
system is the second file on the tape. Also, it is assumed that
partition e of disk 0 is a spare partition on that disk. The tape
drive is already in service.

1. Mount tape on tape drive 0. 

2. Enter the command: 

# echo < /dev/mt0

This will space past the first file on the tape, with no rewind. 

3. Enter: 

# volcopy usr /dev/mt0 t0004 /dev/rw0e S3B003

This will copy the file system from tape to the spare disk partition.
The following messages should appear:

From: /dev/mt0, to: /dev/rw0e? (DEL if wrong)
END: 90000 blocks.

4. Enter the command:

# mount /dev/w0e /bck -r

This will mount the backup partition. The following message 
should appear on the screen: 

WARNING!! - mounting: <usr> as </bck> 

5. Enter the command: 

# ls -l /bck/adm/acct/sum/tacct

This will verify the existence of the file and identify the owner. 
The following output will appear: 

-rw-rw-r-- 1 adm bin 1932 Aug 9 14:27 /bck/adm/acct/sum/tacct 

6. Enter: 

# cp /bck/adm/acct/sum/tacct /usr/adm/acct/sum/tacct

This will copy the file to the specified place. 




7. Enter the command: 

# chown adm /usr/adm/acct/sum/tacct

to change the owner of the file. 

8. Enter the command: 

# umount /dev/w0e
This will unmount the spare partition. 

Sometimes a file system is so large it requires more than one tape to 
store the contents. In this situation, you follow the same procedure to 
restore a file as in the previous example. The volcopy command 
prompts you for additional reels when necessary. In this example, the 
second reel has the wrong label. The y response overrides the incon- 
sistency and the reel is read anyway. 

1. Mount tape on tape drive 0. 

2. Enter: 

# volcopy -bpi1600 -feet2400 usr /dev/rmt0 t0004 /dev/rw0e S3B003

This will copy the file system from tape to the spare disk partition. 
The following messages should appear: 




Reel 1, 2400 feet, 1600 BPI
From: /dev/rmt0, to: /dev/rw0e? (DEL if wrong)
Reading REEL 1 of 3, VOL = 1
Changing drives? (RETURN if no, /dev/rmt_ if yes): RETURN
Mount tape 2
Type volume-ID when ready: 2
Volume is <1>, not <2>.
Want to override? y
Reading REEL 2 of 3, VOL = 1
Fri Jul 29 12:00:02 EDT 1983
Changing drives? (RETURN if no, /dev/rmt_ if yes): RETURN
Mount tape 3
Type volume-ID when ready: 3
Reading REEL 3 of 3, VOL = 3
END: 90000 blocks.

3. Enter the command: 

# mount /dev/w0e /bck -r

This will mount the backup partition. The following message 
should appear on the screen: 

WARNING!! - mounting: <usr> as </bck> 

4. Enter the command: 

# ls -l /bck/adm/acct/sum/tacct

This will verify the existence of the file and identify the owner.
The following output will appear:

-rw-rw-r-- 1 adm bin 1932 Aug 9 14:27 /bck/adm/acct/sum/tacct

5. Enter: 

# cp /bck/adm/acct/sum/tacct /usr/adm/acct/sum/tacct
This will copy the file to the specified place.

6. Enter the command:

# chown adm /usr/adm/acct/sum/tacct
to change the owner of the file.




7. Enter the command: 

# umount /dev/w0e
This will unmount the spare partition. 

4. Message of the Day 

When a user logs in to the system, part of the login procedure prints
out a message of the day. This message can contain several lines of
useful information concerning scheduled down-time for hardware
preventive maintenance (PM), clean-up messages for file systems that
are low on space, or any other useful warnings. The trick to
maintaining this file is to keep it short and to the point. A user
does not want to wait ten minutes while eloquent and wordy dialogue is
spewed from the terminal before he or she can begin working.

The contents of this message are stored in the file /etc/motd. You may
change the contents of this file by using the UniPlus+ system text
editor. See ed or vi in the UniPlus+ System V User's Manual. A sample
of adding and deleting a line from this file is shown below.

# ed /etc/motd
26
p
9/23: Reboot at 5pm today
d
a
9/24: Down for PM 1700-2100 on 9/30.
.
w
37
q
#



You can also remove the contents of the entire file (do not remove the 
file itself; it needs to exist so the login process can read it) by: 

# cp /dev/null /etc/motd

# 




5. System Shutdown 

You will perform three distinct steps when bringing down your
UniPlus+ system. These steps must be performed in the indicated
order, although it is not necessary to bring the system completely down
for certain maintenance operations. For example, preventive
maintenance (such as filesaves) is done in single-user mode without
halting the UniPlus+ system, whereas repairing a hard fault would
necessitate removing power completely. You should never remove power
from a piece of equipment that is in service, and definitely do not
power down the system until the UniPlus+ operating system has been
halted. To bring down the system:



• Run the shutdown program (changes a multiuser system to
single-user mode).

• Halt the UniPlus+ program (the operating system).

• Remove power. 

5.1 Shutdown Program 

Whenever the system must be shut down, such as for filesaves or a 
reboot, you should run the program /etc/shutdown. The shutdown 
procedure is designed to gracefully turn off all processes and bring the 
system back to single-user state with all buffers flushed. 

You must be in the root directory (/) to use the shutdown program. 
You may specify the amount of grace period between sending a warning 
message out and actually shutting down. This grace period is the 
number of seconds of delay. For example, specifying a grace period of 
300 will result in a 5-minute delay. You may also send your own mes- 
sage. A default message is sent to all logged-in users if you don't type 
your own. The following printout is an example of a typical shutdown 
sequence. Enter the following: 

# cd /

# shutdown 300 



Your shutdown procedure may vary slightly from the following,
depending on how it is set up in your system. The shutdown script may
be modified by your administrator according to local procedures. A
typical output is as follows:




SHUTDOWN PROGRAM

Thu Sep 1 18:51:58 EST 1983

Do you want to send your own message? (y or n): y
Type your message followed by <ctrl>d....

System coming down for filesaves!
Please log off.
<ctrl>d

System coming down for filesaves!
Please log off.

(waits for 5 minutes)

SYSTEM BEING BROUGHT DOWN NOW!!!

Busy out (push down) the appropriate
phone lines for this system.

Do you want to continue? (y or n): y

Process accounting stopped.
Error logging stopped.
All currently running processes will now be killed.
Wait for 'INIT: SINGLE USER MODE' before halting.

If you execute the shutdown program while in single-user mode (which
is neither useful nor recommended), the system will not respond with
the 'INIT' message above.

At the completion of this program you can either halt the system (and 
reboot if necessary), power down, start the filesave routine or other 
preventive maintenance, or bring the system back to multiuser mode. 
To go to multiuser, type in telinit 2. See Chapter 5, SINGLE USER AND
MULTIUSER MODE, for more information on changing run level.

6. System Crash Recovery 

An operating system is considered to have crashed when it halts itself
without being asked to. The reason for the halt is often unknown and
can be hardware failure or software related. It is important to
determine the nature of the crash so that it will not happen again.
Note any messages that appear on the console, and any pertinent
information on the processing that was going on at the time the crash
occurred.






Chapter 7: SYSTEM ACCOUNTING

CONTENTS 

1. Introduction
2. General
3. Files and Directories
4. Daily Operation
5. Setting Up the Accounting System
6. RUNACCT
7. Recovering From Failure
8. Restarting RUNACCT
9. Fixing Corrupted Files
   9.1 Fixing WTMP Errors
   9.2 Fixing TACCT Errors
10. Updating Holidays
11. Daily Reports
   11.1 Daily Report
   11.2 Daily Usage Report
   11.3 Daily Command and Monthly Total Command Summaries
   11.4 Last Login
12. Summary

LIST OF FIGURES

Figure 7.1. Directory Structure of the "adm" Login

LIST OF TABLES

TABLE 7.1. Files in the /usr/adm directory
TABLE 7.2. Files in the /usr/adm/acct/fiscal directory
TABLE 7.3. Files in the /usr/adm/acct/nite directory
TABLE 7.4. Files in the /usr/adm/acct/sum directory



Chapter 7 
SYSTEM ACCOUNTING 



1. Introduction 

UniPlus+ system accounting provides methods to collect per-process
resource utilization data, record connect sessions, monitor disk
utilization, and charge fees to specific logins. A set of C language
programs and shell procedures is provided to reduce this accounting
data into summary files and reports. This chapter describes the
structure, implementation, and management of this accounting system,
and discusses the reports generated and the meaning of the columnar
data.

2. General 

The following list is a synopsis of the actions of the accounting system: 

• At process termination, the UniPlus+ system kernel writes one
record per process in /usr/adm/pacct in the form of acct.h.

• The login and init programs record connect sessions by writing
records into /etc/wtmp. Date changes, reboots, and shutdowns (via
acctwtmp) are also recorded in this file.

• The disk utilization programs acctdusg and diskusg break down
disk usage by login.

• Fees for file restores, etc., can be charged to specific logins with
the chargefee shell procedure.

• Each day the runacct shell procedure is executed via cron to
reduce accounting data and produce summary files and reports.

• The monacct procedure can be executed on a monthly or fiscal
period basis. It saves and restarts summary files, generates a
report, and cleans up the sum directory. These saved summary
files could be used to charge users for UniPlus+ system usage.

3. Files and Directories 

The /usr/lib/acct directory contains all of the C language programs and 
shell procedures necessary to run the accounting system. The adm 
login (currently user ID of 4) is used by the accounting system and has 
the login directory structure shown in Figure 7.1. 






               /usr/adm
                  |
                 acct
                  |
        +---------+---------+
        |         |         |
      nite       sum      fiscal

Figure 7.1. Directory Structure of the "adm" Login 

The /usr/adm directory contains the active data collection files. (For a 
complete explanation of the files used by the accounting system, see 
the table at the end of this section.) The nite directory contains files 
that are re-used daily by the runacct procedure. The sum directory 
contains the cumulative summary files updated by runacct. The fiscal 
directory contains periodic summary files created by monacct. 

4. Daily Operation 

When the UniPlus+ system is switched into multiuser mode,
/usr/lib/acct/startup is executed. It does the following:

1. The acctwtmp program adds a "boot" record to /etc/wtmp. This 
record is signified by using the system name as the login name in 
the wtmp record. 

2. Process accounting is started via turnacct. Turnacct on executes 
the accton program with the argument /usr/adm/pacct. 

3. The remove shell procedure is executed to clean up the saved 
pacct and wtmp files left in the sum directory by runacct. 



The ckpacct procedure is run via cron every hour of the day to check 
the size of /usr/adm/pacct. If the file grows past 1000 blocks (default), 
turnacct switch is executed. The advantage of having several smaller 
pacct files becomes apparent when trying to restart runacct after a 
failure processing these records. 
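
The size test that ckpacct applies can be pictured with a short sketch. This is not the ckpacct procedure itself: the function name pacct_action and its "switch"/"keep" output are hypothetical, and only the 1000-block default comes from the text above.

```shell
# pacct_action: decide what to do with a process accounting file of
# $1 blocks, given a size limit in blocks ($2, default 1000).
pacct_action() {
    size=$1
    limit=${2:-1000}
    if [ "$size" -gt "$limit" ]
    then
        echo "switch"    # the point at which "turnacct switch" is run
    else
        echo "keep"      # file is still small enough; leave it alone
    fi
}
```

The comparison is strictly greater than, matching "grows past 1000 blocks": a file of exactly 1000 blocks is left in place.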

The chargefee program can be used to bill users for file restores, etc. It 
adds records to /usr/adm/fee which are picked up and processed by the 
next execution of runacct and merged into the total accounting records. 

Runacct is executed via cron each night. It processes the active
accounting files, /usr/adm/pacct, /etc/wtmp,
/usr/adm/acct/nite/disktacct, and /usr/adm/fee. It produces command
summaries and usage summaries by login.

When the system is shut down using shutdown, the shutacct shell pro- 
cedure is executed. It writes a shutdown reason record into /etc/wtmp 
and turns process accounting off. 

After the first reboot each morning, the computer operator should
execute /usr/lib/acct/prdaily to print the previous day's accounting
report.

5. Setting Up the Accounting System 

In order to automate the operation of this accounting system, several 
things need to be done: 

1. If not already present, add this line to the /etc/rc file in the state 2
section:

/bin/su - adm -c /usr/lib/acct/startup

2. If not already present, add this line to /etc/shutdown to turn off the
accounting before the system is brought down:

/usr/lib/acct/shutacct

3. For most installations, the following three entries should be made
in /usr/spool/cron/crontab/adm so that cron will automatically run
the daily accounting.

0 4 * * 1-6 /usr/lib/acct/runacct 2> /usr/adm/acct/nite/fd2log
0 2 * * 4 /usr/lib/acct/dodisk
5 * * * * /usr/lib/acct/ckpacct

4. To facilitate monthly merging of accounting data, the following 
entry in /usr/spool/cron/crontab/adm will allow monacct to clean up 
all daily reports and daily total accounting files and deposit one 
monthly total report and one monthly total accounting file in the 
fiscal directory. 

15 5 1 * * /usr/lib/acct/monacct

The above entry takes advantage of the default action of monacct,
which uses the current month's date as the suffix for the file names.
Notice that the entry is executed at a time that allows runacct
sufficient time to complete. On the first day of each month, this
will create monthly accounting files with the entire month's data.

5. The PATH shell variable should be set in /usr/adm/.profile to:

PATH=/usr/lib/acct:/bin:/usr/bin
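
Each crontab entry in steps 3 and 4 consists of five schedule fields (minute, hour, day of the month, month, day of the week) followed by the command to run. The sketch below labels the fields of one entry so that a schedule can be read back before it is installed. The function name cron_fields and its output format are hypothetical, chosen only for this illustration.

```shell
# cron_fields: label the five schedule fields and the command of one
# crontab entry passed as a single quoted argument.
cron_fields() {
    # Feeding the entry to read splits it into words without any
    # file name expansion of "*".
    echo "$1" | {
        read min hour dom mon dow cmd
        echo "minute=$min hour=$hour day=$dom month=$mon weekday=$dow command=$cmd"
    }
}
```

Applied to the monacct entry, cron_fields '15 5 1 * * /usr/lib/acct/monacct' reports minute 15, hour 5, day 1, with the month and weekday fields left as "*": that is, 5:15 a.m. on the first day of every month.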

6. RUNACCT 

Runacct is the main daily accounting shell procedure. It is normally 
initiated via cron during nonprime time hours. Runacct processes con- 
nect, fee, disk, and process accounting files. It also prepares daily and 
cumulative summary files for use by prdaily or for billing purposes. 
The following files produced by runacct are of particular interest: 



nite/lineuse    Produced by acctcon, which reads the wtmp file and
                produces usage statistics for each terminal line on
                the system. This report is especially useful for
                detecting bad lines. If the ratio of logoffs to
                logins exceeds about 3 to 1, there is a good
                possibility that the line is failing.

nite/daytacct   This file is the total accounting file for the
                previous day in tacct.h format.

sum/tacct       This file is the accumulation of each day's
                nite/daytacct and can be used for billing purposes.
                It is restarted each month or fiscal period by the
                monacct procedure.

sum/daycms      Produced by the acctcms program. It contains the
                daily command summary. The ASCII version of this
                file is nite/daycms.

sum/cms         The accumulation of each day's command summaries.
                It is restarted by the execution of monacct. The
                ASCII version is nite/cms.

sum/loginlog    Produced by the lastlogin shell procedure. It
                maintains a record of the last time each login was
                used.

sum/rprtMMDD    Each execution of runacct saves a copy of the daily
                report that can be printed by prdaily.
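
The bad-line rule of thumb given for nite/lineuse can be written as a one-line test. This is a sketch only: the name line_suspect is hypothetical, the login and logoff counts are assumed to have been read from the report by hand or by a site script, and the $(( )) arithmetic assumes a POSIX shell.

```shell
# line_suspect: succeed (exit status 0) when a terminal line's logoff
# count exceeds three times its login count, the rough 3-to-1 ratio
# quoted for nite/lineuse.
#   $1 number of logins    $2 number of logoffs
line_suspect() {
    [ "$2" -gt $(( $1 * 3 )) ]
}
```

It can be used in a condition such as: if line_suspect 10 31; then echo "check this line"; fi. Because the text says "about 3 to 1", the cutoff is a starting point and not a firm limit.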



Runacct takes care not to damage files in the event of errors. A series
of protection mechanisms is used that attempt to recognize an error,
provide intelligent diagnostics, and terminate processing in such a way
that runacct can be restarted with minimal intervention. It records its
progress by writing descriptive messages into the file active. (Files
used by runacct are assumed to be in the nite directory unless
otherwise noted.) All diagnostic output during the execution of
runacct is written into fd2log. Runacct will complain if the files
lock and lock1 exist when invoked. The lastdate file contains the
month and day runacct was last invoked and is used to prevent more
than one execution per day. If runacct detects an error, a message is
written to /dev/console, mail is sent to root and adm, locks are
removed, diagnostic files are saved, and execution is terminated.



In order to allow runacct to be restartable, processing is broken down 
into separate reentrant states. A file is used to remember the last state 
completed. When each state completes, statefile is updated to reflect 
the next state. After processing for the state is complete, statefile is 
read and the next state is processed. When runacct reaches the 
CLEANUP state, it removes the locks and terminates. States are exe- 
cuted as follows: 



SETUP           The command turnacct switch is executed. The
                process accounting files, /usr/adm/pacct?, are
                moved to /usr/adm/Spacct?.MMDD. The /etc/wtmp
                file is moved to /usr/adm/acct/nite/wtmp.MMDD
                with the current time added on the end.

WTMPFIX         The wtmp file in the nite directory is checked for
                correctness by the wtmpfix program. Some date
                changes will cause acctcon1 to fail, so wtmpfix
                attempts to adjust the time stamps in the wtmp
                file if a date change record appears.

CONNECT1        Connect session records are written to ctmp in
                the form of ctmp.h. The lineuse file is created,
                and the reboots file is created showing all of the
                boot records found in the wtmp file.

CONNECT2        Ctmp is converted to ctacct.MMDD, which contains
                connect accounting records. (Accounting records
                are in tacct.h format.)

PROCESS         The acctprc1 and acctprc2 programs are used to
                convert the process accounting files,
                /usr/adm/Spacct?.MMDD, into total accounting
                records in ptacct?.MMDD. The Spacct and ptacct
                files are correlated by number so that if runacct
                fails, unnecessary reprocessing of Spacct files
                will not occur. One precaution should be noted:
                when restarting runacct in this state, remove the
                last ptacct file because it will not be complete.

MERGE           Merge the process accounting records with the
                connect accounting records to form daytacct.

FEES            Merge in any ASCII tacct records from the file
                fee into daytacct.

DISK            On the day after the dodisk procedure runs,
                merge disktacct with daytacct.

MERGETACCT      Merge daytacct with sum/tacct, the cumulative
                total accounting file. Each day, daytacct is saved
                in sum/tacctMMDD, so that sum/tacct can be
                recreated in the event it becomes corrupted or
                lost.

CMS             Merge in today's command summary with the
                cumulative command summary file sum/cms.
                Produce ASCII and internal format command
                summary files.

USEREXIT        Any installation dependent (local) accounting
                programs can be included here.

CLEANUP         Clean up temporary files, run prdaily and save its
                output in sum/rprtMMDD, remove the locks, then
                exit.



7. Recovering From Failure 

The runacct procedure can fail for a variety of reasons; usually due to a 
system crash, /usr running out of space, or a corrupted wtmp file. If the 
activeMMDD file exists, check it first for error messages. If the active 
file and lock files exist, check fd2log for any mysterious messages. The 
following are error messages produced by runacct and the recom- 
mended recovery actions: 






ERROR: locks found, run aborted 

The files lock and lock1 were found. These files must be removed
before runacct can restart.



ERROR: acctg already run for date : check 
/usr/adm/acct/nite/lastdate 

The date in lastdate and today's date are the same. Remove lastdate.



ERROR: turnacct switch returned rc=?

Check the integrity of turnacct and accton. The accton program 
must be owned by root and have the setuid bit set. 

ERROR: Spacct?.MMDD already exists

File setups probably already run. Check status of files, then run 
setups manually. 

ERROR: /usr/adm/acct/nite/wtmp.MMDD already exists, run setup
manually

Self-explanatory. 

ERROR: wtmpfix errors see /usr/adm/acct/nite/wtmperror 

Wtmpfix detected a corrupted wtmp file. Use fwtmp to correct the
corrupted file.

ERROR: connect acctg failed: check /usr/adm/acct/nite/log 

The acctcon1 program encountered a bad wtmp file. Use fwtmp to
correct the bad file.




ERROR: Invalid state, check /usr/adm/acct/nite/active 

The file statefile is probably corrupted. Check statefile and read 
active before restarting. 

8. Restarting RUNACCT 

Runacct called without arguments assumes that this is the first invoca- 
tion of the day. The argument MMDD is necessary if runacct is being 
restarted and specifies the month and day for which runacct will rerun 
the accounting. The entry point for processing is based on the contents 
of statefile. To override statefile, include the desired state on the com- 
mand line. For example: 

To start runacct: 

nohup runacct 2> /usr/adm/acct/nite/fd2log &

To restart runacct: 

nohup runacct 0601 2> /usr/adm/acct/nite/fd2log &

To restart runacct at a specific state: 

nohup runacct 0601 WTMPFIX 2> /usr/adm/acct/nite/fd2log &
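
The three invocations above differ only in their arguments. As a rough
illustration (it is not part of the accounting package), the following
Python sketch assembles the same command lines and rejects a state name
that this chapter does not list. The function name runacct_command and
the rule that a state may only be given together with MMDD are
assumptions drawn from the examples above.

```python
# States named in this chapter, in the order runacct steps through them.
RUNACCT_STATES = [
    "SETUP", "WTMPFIX", "CONNECT1", "CONNECT2", "PROCESS", "MERGE",
    "FEES", "DISK", "MERGETACCT", "CMS", "USEREXIT", "CLEANUP",
]

# File that receives the diagnostic output in the examples above.
FD2LOG = "/usr/adm/acct/nite/fd2log"


def runacct_command(mmdd=None, state=None):
    """Build the shell command line for starting or restarting runacct.

    Hypothetical helper: it only formats a string, it runs nothing.
    """
    if state is not None and mmdd is None:
        # Assumption: the state follows MMDD on the command line.
        raise ValueError("a state can only be given together with MMDD")
    words = ["nohup", "runacct"]
    if mmdd is not None:
        if len(mmdd) != 4 or not mmdd.isdigit():
            raise ValueError("MMDD must be four digits, e.g. 0601")
        words.append(mmdd)
    if state is not None:
        if state not in RUNACCT_STATES:
            raise ValueError("unknown state: " + state)
        words.append(state)
    return " ".join(words) + " 2> " + FD2LOG + " &"


if __name__ == "__main__":
    print(runacct_command())
    print(runacct_command("0601"))
    print(runacct_command("0601", "WTMPFIX"))
```

Keeping the state names in one list makes a mistyped state fail before
anything is handed to the shell.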

9. Fixing Corrupted Files 

Unfortunately, this accounting system is not entirely foolproof. Occa- 
sionally, a file will become corrupted or lost. Some of the files can sim- 
ply be ignored or restored from the file save backup. However, certain 
files must be fixed in order to maintain the integrity of the accounting 
system. 

9.1 Fixing WTMP Errors

The wtmp files seem to cause the most problems in the day-to-day
operation of the accounting system. When the date is changed and the
UniPlus+ system is in multiuser mode, a set of date change records is
written into /etc/wtmp. The wtmpfix program is designed to adjust the
time stamps in the wtmp records when a date change is encountered.
However, some combinations of date changes and reboots will slip
through wtmpfix and cause acctcon1 to fail. The following steps show
how to patch up a wtmp file.

cd /usr/adm/acct/nite

fwtmp < wtmp.MMDD > xwtmp

ed xwtmp

     delete corrupted records or
     delete all records from beginning up to the date change

fwtmp -ic < xwtmp > wtmp.MMDD

If the wtmp file is beyond repair, create a null wtmp file. This will
prevent any charging of connect time. Acctprc1 will not be able to
determine which login owned a particular process, but it will be charged
to the login that is first in the password file for that user ID.

9.2 Fixing TACCT Errors 

If the installation is using the accounting system to charge users for
system resources, the integrity of sum/tacct is quite important.
Occasionally, mysterious tacct records will appear with negative numbers,
duplicate user IDs, or a user ID of 65,535. First check sum/tacctprev with
prtacct. If it looks all right, the latest sum/tacct.MMDD should be
patched up, then sum/tacct recreated. A simple patchup procedure
would be:

cd /usr/adm/acct/sum

acctmerg -v < tacct.MMDD > xtacct

ed xtacct

     remove the bad records
     write duplicate uid records to another file

acctmerg -i < xtacct > tacct.MMDD

acctmerg tacctprev < tacct.MMDD > tacct

Remember that the monacct procedure removes all the tacct.MMDD
files; therefore, sum/tacct can be recreated by merging these files
together.

10. Updating Holidays 

The file /usr/lib/acct/holidays contains the prime/nonprime table for the 
accounting system. The table should be edited to reflect your location's 
holiday schedule for the year. The format is composed of three types 
of entries: 




1. Comment Lines: Comment lines may appear anywhere in the file
as long as the first character in the line is an asterisk.

2. Year Designation Line: This line should be the first data line (non- 
comment line) in the file and must appear only once. The line 
consists of three fields of four digits each (leading white space is 
ignored). For example, to specify the year as 1985, prime time at 
9:00 a.m., and nonprime time at 4:30 p.m., the following entry 
would be appropriate: 

1985 0900 1630 

A special condition allowed for in the time field is that the time 
2400 is automatically converted to 0000. 

3. Company Holidays Lines: These entries follow the year designation 
line and have the following general format: 

day-of-year Month Day Description of Holiday 

The day-of-year field is a number in the range of 1 through 366 
indicating the day for the corresponding holiday (leading white 
space is ignored). The other three fields are actually commentary 
and are not currently used by other programs. 
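
The three entry types can be illustrated with a small reader. The
following Python sketch is not the accounting system's own parser; it
only follows the rules stated above. The function name parse_holidays,
the tolerance of blank lines, and the sample holiday entries are
assumptions made for the example.

```python
def parse_holidays(text):
    """Parse the text of a holidays file as described above.

    Returns (year, prime_start, nonprime_start, holidays): the two times
    are HHMM integers and holidays is a list of day-of-year numbers.
    """
    year = prime = nonprime = None
    holidays = []
    for line in text.splitlines():
        if line.startswith("*"):
            continue                    # comment: asterisk in column one
        fields = line.split()           # leading white space is ignored
        if not fields:
            continue                    # blank line (tolerated here; an assumption)
        if year is None:
            # First data line: year, prime start, nonprime start.
            year, prime, nonprime = (int(f) for f in fields[:3])
            # The time 2400 is converted to 0000.
            prime = 0 if prime == 2400 else prime
            nonprime = 0 if nonprime == 2400 else nonprime
            continue
        day = int(fields[0])            # the remaining fields are commentary
        if not 1 <= day <= 366:
            raise ValueError("day-of-year out of range: %d" % day)
        holidays.append(day)
    return year, prime, nonprime, holidays


if __name__ == "__main__":
    sample = (
        "* Prime/Nonprime Table\n"
        "  1985 0900 1630\n"
        "* Day  Date    Description\n"
        "1      Jan 1   New Year's Day\n"
        "359    Dec 25  Christmas\n"
    )
    print(parse_holidays(sample))
```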

11. Daily Reports 

Runacct generates five basic reports upon each invocation. They cover
the areas of connect accounting, usage by person on a daily basis,
command usage reported by daily and monthly totals, and a report of the
last time users were logged in.

The following paragraphs describe the reports and the meanings of their 
tabulated data. 

11.1 Daily Report 

In the first part of the report, the from/to banner should alert the
administrator to the period reported on. The period runs from the time
the last accounting report was generated to the time the current
accounting report was generated. The banner is followed by a log of
system reboots, shutdowns, power fail recoveries, and any other record
dumped into /etc/wtmp by the acctwtmp program [see acct(1M) in the
UniPlus+ System V Administrator's Manual].






The second part of the report is a breakdown of line utilization. The 
TOTAL DURATION tells how long the system was in multiuser state 
(able to be accessed through the terminal lines). The columns are: 



LINE          The terminal line or access port.

MINUTES       The total number of minutes that line was in use
              during the accounting period.

PERCENT       The total number of MINUTES the line was in
              use divided by the TOTAL DURATION, expressed
              as a percentage.

# SESS        The number of times this port was accessed for a
              login(1) session.

# ON          This column does not have much meaning any
              more. It used to give the number of times that
              the port was used to log a user on; but since
              login(1) can no longer be executed explicitly to
              log in a new user, this column should be identical
              to # SESS.

# OFF         This column reflects not just the number of times
              a user logged off but also any interrupts that
              occur on that line. Generally, interrupts occur
              on a port when the getty(1M) is first invoked
              when the system is brought to multiuser state.
              Where this column does come into play is when
              the # OFF exceeds the # ON by a large factor.
              This usually indicates that the multiplexer,
              modem, or cable is going bad, or there is a bad
              connection somewhere. The most common
              cause of this is an unconnected cable dangling
              from the multiplexer.



During normal operation, /etc/wtmp should be monitored, since this is
the file from which the connect accounting is derived. If it grows
rapidly, execute acctcon1 to see which tty line is the noisiest. If the
interrupting is occurring at a furious rate, general system performance
will be affected.
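
The PERCENT column and the # OFF warning sign can be worked through
with a few lines of arithmetic. The Python sketch below is an
illustration only, not what prdaily does internally; the function names
and the default factor of 10 are assumptions, since the text says only
that # OFF exceeds # ON "by a large factor".

```python
def line_percent(minutes, total_duration):
    """PERCENT column: share of TOTAL DURATION during which the line was in use."""
    if total_duration <= 0:
        raise ValueError("TOTAL DURATION must be positive")
    return 100.0 * minutes / total_duration


def noisy_lines(rows, factor=10):
    """Return the lines whose # OFF exceeds # ON by more than `factor` times.

    rows is a list of (line, n_on, n_off) tuples. The factor is a
    hypothetical threshold chosen for this example.
    """
    return [line for line, n_on, n_off in rows
            if n_off > factor * max(n_on, 1)]


if __name__ == "__main__":
    # A line in use for 120 of 480 minutes.
    print(line_percent(120, 480))
    # tty02 logs far more OFF records than ON records.
    print(noisy_lines([("tty01", 3, 5), ("tty02", 2, 90)]))
```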

11.2 Daily Usage Report 

This report gives a by-user breakdown of system resource utilization. 
Its data consists of: 






UID               The user ID.

LOGIN NAME        The login name of the user. There can be
                  more than one login name for a single user
                  ID; this identifies which one.

CPU (MINS)        This represents the amount of time the user's
                  processes used the central processing unit. This
                  category is broken down into PRIME and
                  NPRIME (nonprime) utilization. The
                  accounting system's idea of this breakdown is
                  located in the /usr/lib/acct/holidays file. As
                  delivered, prime time is defined to be 0900
                  through 1700 hours.

KCORE-MINS        This represents a cumulative measure of the
                  amount of memory a process uses while
                  running. The amount shown reflects kilobyte
                  segments of memory used per minute. This
                  measurement is also broken down into
                  PRIME and NPRIME amounts.

CONNECT (MINS)    This identifies "Real Time" used. What this
                  column really identifies is the amount of time
                  that a user was logged into the system. If this
                  time is rather high and the column "# OF
                  PROCS" is low, this user is what is called a
                  "line hog". That is, this person logs in first
                  thing in the morning and hardly touches the
                  terminal the rest of the day. Watch
                  out for these kinds of users. This column is
                  also subdivided into PRIME and NPRIME
                  utilization.

DISK BLOCKS       When the disk accounting programs have
                  been run, the output is merged into the total
                  accounting record (tacct.h) and shows up in
                  this column. This disk accounting is
                  accomplished by the program acctdusg.

# OF PROCS        This column reflects the number of processes
                  that were invoked by the user. This is a good
                  column to watch for large numbers indicating
                  that a user may have a shell procedure that
                  runs amok.




# OF SESS         This is how many times the user logged onto
                  the system.

# DISK SAMPLES    This indicates how many times the disk
                  accounting was run to obtain the average
                  number of DISK BLOCKS listed earlier.

FEE               An often unused field in the total accounting
                  record, the FEE field represents the total
                  accumulation of widgets charged against the
                  user by the chargefee shell procedure [see
                  acctsh(1M)]. The chargefee procedure is
                  used to levy charges against a user for special
                  services performed such as file restores, etc.

11.3 Daily Command and Monthly Total Command Summaries 

These two reports are virtually the same except that the Daily Com- 
mand Summary only reports on the current accounting period while the 
Monthly Total Command Summary tells the story for the start of the 
fiscal period to the current date. In other words, the monthly report 
reflects the data accumulated since the last invocation of monacct. 

The data included in these reports gives an administrator an idea as to 
the heaviest used commands and, based on those commands' charac- 
teristics of system resource utilization, a hint as to what to weigh more 
heavily when system tuning. 

These reports are sorted by TOTAL KCOREMIN, which is an arbitrary 
yardstick but often a good one for calculating "drain" on a system. 

COMMAND NAME      This is the name of the command.
                  Unfortunately, all shell procedures are lumped
                  together under the name sh since only object
                  modules are reported by the process
                  accounting system. The administrator should monitor
                  the frequency of programs called a.out or core
                  or any other name that does not seem quite
                  right. Often people like to work on their
                  favorite version of backgammon only they do
                  not want everyone to know about it. Acctcom
                  is also a good tool to use for determining who
                  executed a suspiciously named command and
                  also if superuser privileges were used.

NUMBER CMDS       This is the total number of invocations of this
                  particular command.

TOTAL KCOREMIN    The total cumulative measurement of the
                  amount of kilobyte segments of memory used
                  by a process per minute of run time.

TOTAL CPU-MIN     The total processing time this program has
                  accumulated.

TOTAL REAL-MIN    The total real-time (wall-clock) minutes this
                  program has accumulated. This total is the
                  actual "waited for" time as opposed to
                  kicking off a process in the background.

MEAN SIZE-K       This is the mean of the TOTAL KCOREMIN
                  over the number of invocations reflected by
                  NUMBER CMDS.

MEAN CPU-MIN      This is the mean derived between the
                  NUMBER CMDS and TOTAL CPU-MIN.

HOG FACTOR        This is a relative measurement of the ratio of
                  system availability to system utilization. It is
                  computed by the formula

                       (total CPU time) / (elapsed time)

                  This gives a relative measure of the total
                  available CPU time consumed by the process
                  during its execution.

CHARS TRNSFD      This column, which may go negative, is a
                  total count of the number of characters
                  pushed around by the read(2) and write(2)
                  system calls.

BLOCKS READ       A total count of the physical block reads and
                  writes that a process performed.
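
As a worked example of the derived columns, the Python sketch below
computes HOG FACTOR from the formula given above and MEAN CPU-MIN as
total CPU minutes per invocation. The function names are made up for
the illustration, and returning zero when no elapsed time or no
invocations were recorded is an assumption, not documented behavior.

```python
def hog_factor(total_cpu_min, total_real_min):
    """HOG FACTOR: (total CPU time) / (elapsed time)."""
    if total_real_min <= 0:
        return 0.0                      # assumption: avoid dividing by zero
    return float(total_cpu_min) / total_real_min


def mean_cpu_min(total_cpu_min, number_cmds):
    """MEAN CPU-MIN: TOTAL CPU-MIN averaged over NUMBER CMDS."""
    if number_cmds <= 0:
        return 0.0                      # assumption: avoid dividing by zero
    return float(total_cpu_min) / number_cmds


if __name__ == "__main__":
    # A command that used 3 CPU minutes over 12 wall-clock minutes
    # in 6 invocations.
    print(hog_factor(3.0, 12.0))
    print(mean_cpu_min(3.0, 6))
```

A hog factor near 1 means the command kept the CPU busy for almost all
of its elapsed time; a value near 0 means it mostly waited.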



11.4 Last Login 

This report simply gives the date when a particular login was last used. 
This could be a good source for finding likely candidates for the 
archives or getting rid of unused logins and login directories. 




12. Summary 

The UniPlus+ system accounting was designed from a system
administrator's point of view. Every possible precaution has been taken 
to ensure that the system will run smoothly and without error. It is 
important to become familiar with the C programs and shell procedures. 
The manual pages should be studied, and it is advisable to keep a 
printed copy of the shell procedures handy. The accounting system 
should be easy to maintain, provide valuable information for the 
administrator, and provide accurate breakdowns of the usage of system 
resources for charging purposes. 






TABLE 7.1. Files in the /usr/adm directory 



diskdiag          diagnostic output during the execution of disk
                  accounting programs

dtmp              output from the acctdusg program

fee               output from the chargefee program, ASCII
                  tacct records

pacct             active process accounting file

pacct?            process accounting files switched via turnacct

Spacct?.MMDD      process accounting files for MMDD during
                  execution of runacct



TABLE 7.2. Files in the /usr/adm/acct/fiscal directory

cms?              total command summary file for fiscal ? in
                  internal summary format

fiscrpt?          report similar to prdaily for fiscal ?

tacct?            total accounting file for fiscal ?






TABLE 7.3. Files in the /usr/adm/acct/nite directory

active            used by runacct to record progress and print
                  warning and error messages

activeMMDD        same as active after runacct detects an error

cms               ASCII total command summary used by
                  prdaily

ctacct.MMDD       connect accounting records in tacct.h format

ctmp              output of acctcon1 program, connect session
                  records in ctmp.h format

daycms            ASCII daily command summary used by
                  prdaily

daytacct          total accounting records for 1 day in tacct.h
                  format

disktacct         disk accounting records in tacct.h format,
                  created by dodisk procedure

fd2log            diagnostic output during execution of runacct
                  (see cron entry)

lastdate          last day runacct executed in date +%m%d
                  format

lock lock1        used to control serial use of runacct

lineuse           tty line usage report used by prdaily

log               diagnostic output from acctcon1

logMMDD           same as log after runacct detects an error

reboots           contains beginning and ending dates from
                  wtmp, and a listing of reboots

statefile         used to record current state during execution
                  of runacct

tmpwtmp           wtmp file corrected by wtmpfix

wtmperror         place for wtmpfix error messages

wtmperrorMMDD     same as wtmperror after runacct detects an
                  error

wtmp.MMDD         previous day's wtmp file






TABLE 7.4. Files in the /usr/adm/acct/sum directory 



cms               total command summary file for current fiscal
                  in internal summary format

cmsprev           command summary file without latest update

daycms            command summary file for yesterday in
                  internal summary format

loginlog          created by lastlogin

pacct.MMDD        concatenated version of all pacct files for
                  MMDD, removed after reboot by remove
                  procedure

rprtMMDD          saved output of prdaily program

tacct             cumulative total accounting file for current
                  fiscal

tacctprev         same as tacct without latest update

tacctMMDD         total accounting file for MMDD

wtmp.MMDD         saved copy of wtmp file for MMDD, removed
                  after reboot by remove procedure






Chapter 8: FSCK: FILE SYSTEM CHECKING



CONTENTS 

1. Introduction

2. General
   2.1 System Administrator Advice

3. Update of the File System
   3.1 Superblock
   3.2 Inodes
   3.3 Indirect Blocks
   3.4 Data Blocks
   3.5 First Free-List Block

4. Corruption of the File System
   4.1 Improper System Shutdown and Startup
   4.2 Hardware Failure

5. Detection and Correction of Corruption
   5.1 Superblock
       5.1.1 File System Size and Inode-List Size
       5.1.2 Free-Block List
       5.1.3 Free-Block Count
       5.1.4 Free-Inode Count
   5.2 Inodes
       5.2.1 Format and Type
       5.2.2 Link Count
       5.2.3 Duplicate Blocks
       5.2.4 Bad Blocks
       5.2.5 Size Checks
   5.3 Indirect Blocks
   5.4 Data Blocks
   5.5 Free-List Blocks

6. FSCK Error Conditions
   6.1 Conventions
   6.2 Initialization
   6.3 PHASE 1: CHECK BLOCKS AND SIZES
   6.4 PHASE 1B: RESCAN FOR MORE DUPS
   6.5 PHASE 2: CHECK PATHNAMES
   6.6 PHASE 3: CHECK CONNECTIVITY
   6.7 PHASE 4: CHECK REFERENCE COUNTS
   6.8 PHASE 5: CHECK FREE LIST
   6.9 PHASE 6: SALVAGE FREE LIST
   6.10 CLEANUP



Chapter 8 
FSCK: FILE SYSTEM CHECKING 



1. Introduction 

The File System Check Program (fsck) is an interactive file system
check and repair program. Fsck uses the redundant structural
information in the UniPlus+ system file system to perform several consistency
checks. If an inconsistency is detected, it is reported to the operator,
who may elect to fix or ignore each inconsistency. These
inconsistencies result from the permanent interruption of the file system updates,
which are performed every time a file is modified. Fsck is frequently
able to repair corrupted file systems using procedures based upon the
order in which the UniPlus+ system honors these file system update
requests.

The purpose of this chapter is to describe the normal updating of the 
file system, to discuss the possible causes of file system corruption, and 
to present the corrective actions implemented by fsck. Both the pro- 
gram and the interaction between the program and the operator are 
described. 

The fsck error conditions are listed in the last section of this chapter. 
The meanings of the various error conditions, possible responses, and 
related error conditions are explained. 

2. General 

When a UniPlus+ operating system is brought up, a consistency check
of the file systems should always be performed. This precautionary 
measure helps to ensure a reliable environment for file storage on disk. 
If an inconsistency is discovered, corrective action must be taken. 

The updating of the file system and the corruption of the file system
are described in this chapter. Finally, the set of heuristically sound
corrective actions used by fsck is presented.






FSCK 



2.1 System Administrator Advice 

Remember that system buffers are 1024 bytes. When configuring the 
operating system, take into consideration that the same number of 
buffers as before will use more main memory. Weigh this against 
reducing the number of buffers, which reduces the cache hit ratio and 
degrades performance. 

3. Update of the File System 

Every working day hundreds of files are created, modified, and 
removed. Every time a file is modified, the UniPlus+ system performs
a series of file system updates. These updates, when written on disk, 
yield a consistent file system. To understand what happens in the event 
of a permanent interruption in this sequence, it is important to under- 
stand the order in which the update requests were probably being 
honored. Knowing which pieces of information were probably written 
to the file system first, heuristic procedures can be developed to repair a 
corrupted file system. 

There are five types of file system updates. These involve the super- 
block, inodes, indirect blocks, data blocks (directories and files), and 
free-list blocks. 

3.1 Superblock

The superblock contains information about the size of the file system, 
the size of the inode list, part of the free-block list, the count of free 
blocks, the count of free inodes, and part of the free-inode list. 

The superblock of a mounted file system (the root file system is always 
mounted) is written to the file system whenever the file system is 
unmounted or a sync command is issued. 

3.2 Inodes 

An inode contains information about the type of inode (directory, data,
or special), the number of directory entries linked to the inode, the list
of blocks claimed by the inode, and the size of the inode.

An inode is written to the file system upon closure of the file associated
with the inode. (All in-core blocks are also written to the file system
upon issue of a sync system call.)






3.3 Indirect Blocks 

There are three types of indirect blocks: single-indirect,
double-indirect, and triple-indirect. A single-indirect block contains a list of
some of the block numbers claimed by an inode. Each one of the 128
entries in an indirect block is a data-block number. A double-indirect
block contains a list of single-indirect block numbers. A triple-indirect
block contains a list of double-indirect block numbers.

Indirect blocks are written to the file system whenever they have been 
modified and released by the operating system. More precisely, they 
are queued for eventual writing. Physical I/O is deferred until the 
buffer is needed by the UNIX system or a sync command is issued. 
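
The reach of each level of indirection follows directly from the figure
of 128 entries per indirect block quoted above. The Python sketch below
is only a worked calculation under that figure; the function name
blocks_addressed is an assumption, and a file system with a different
number of entries per block can pass its own value.

```python
ENTRIES_PER_INDIRECT = 128   # entries in one indirect block, as stated above


def blocks_addressed(level, entries=ENTRIES_PER_INDIRECT):
    """Data blocks reachable through one indirect block of the given level.

    level 1 = single-indirect, 2 = double-indirect, 3 = triple-indirect.
    """
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2, or 3")
    # Each extra level multiplies the reach by the entries per block.
    return entries ** level


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, blocks_addressed(level))
```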

3.4 Data Blocks 

A data block may contain file information or directory entries. Each 
directory entry consists of a file name and an inode number. 

Data blocks are written to the file system whenever they have been 
modified and released by the operating system. 

3.5 First Free-List Block 

The superblock contains the first free-list block. The free-list blocks 
are a list of all blocks that are not allocated to the superblock, inodes, 
indirect blocks, or data blocks. Each free-list block contains a count of 
the number of entries in this free-list block, a pointer to the next free- 
list block, and a partial list of free blocks in the file system. 

Free-list blocks are written to the file system whenever they have been 
modified and released by the operating system. 

4. Corruption of the File System 

A file system can become corrupted in a variety of ways. Improper 
shutdown procedures and hardware failures are the most common. 

4.1 Improper System Shutdown and Startup 

File systems may become corrupted when proper shutdown procedures
are not observed, e.g., forgetting to sync the system prior to halting the
CPU, physically write-protecting a mounted file system, or taking a
mounted file system off-line.

File systems may become further corrupted if a corrupted file system is
allowed to be used (and, thus, to be modified further); the results can
be disastrous.

4.2 Hardware Failure 

Any piece of hardware can fail at any time. Failures can be as subtle as 
a bad block on a disk platter or as blatant as a nonfunctional disk con- 
troller. 

5. Detection and Correction of Corruption 

A quiescent file system (one that is unmounted and not being written
on) may be checked for structural integrity by performing consistency
checks on the redundant data intrinsic to a file system. The redundant
data is either read from the file system or computed from other known
values. A quiescent state is important during the checking of a file
system because of the multipass nature of the fsck program.

When an inconsistency is discovered, fsck reports the inconsistency so
that the operator can choose a corrective action.

This part discusses how to discover inconsistencies (and the possible
corrective actions) for the superblock, the inodes, the indirect blocks,
the data blocks containing directory entries, and the free-list blocks.
These corrective actions can be performed interactively by the fsck
command under control of the operator.

5.1 Superblock 

One of the most common corrupted items is the superblock. The 
superblock is prone to corruption because every change to the file 
system's blocks or inodes modifies the superblock. 

The superblock and its associated parts are most often corrupted when 
the computer is halted and the last command involving output to the 
file system was not a sync command. 






The superblock can be checked for inconsistencies involving file system 
size, inode-list size, free-block list, free-block count, and the free-inode 
count. 

5.1.1 File System Size and Inode-List Size 

The file system size must be larger than the number of blocks used by 
the superblock and the number of blocks used by the list of inodes. 
The number of inodes must be less than 65,535. The file system size 
and inode-list size are critical pieces of information to the fsck pro- 
gram. While there is no way to actually check these sizes, fsck can 
check for them being within reasonable bounds. All other checks of 
the file system depend on the correctness of these sizes. 

5.1.2 Free-Block List 

The free-block list starts in the superblock and continues through the 
free-list blocks of the file system. Each free-list block can be checked 
for a list count out of range, for block numbers out of range, and for 
blocks already allocated within the file system. A check is made to see 
that all the blocks in the file system were found. 

The first free-block list is in the superblock. Fsck checks the list count
for a value of less than zero or greater than 50. It also checks each block
number for a value of less than the first data block in the file system or
greater than the last block in the file system. Then it compares each
block number to a list of already allocated blocks. If the free-list block
pointer is nonzero, the next free-list block is read in and the process is
repeated.

When all the blocks have been accounted for, a check is made to see if 
the number of blocks used by the free-block list plus the number of 
blocks claimed by the inodes equals the total number of blocks in the 
file system. 

If anything is wrong with the free-block list, then fsck may rebuild the 
list, excluding all blocks in the list of allocated blocks. 
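
The free-list checks described in this part can be summarized in a
short sketch. The Python below is not fsck's code; it models one
free-list block as a count plus a list of block numbers and applies the
three tests named above, followed by the final accounting check. All
function and parameter names are assumptions made for the illustration.

```python
MAX_FREE_ENTRIES = 50   # upper bound on the list count, from the text above


def check_free_list_block(count, block_numbers, first_data_block,
                          last_block, allocated):
    """Return a list of problems found in one free-list block.

    allocated is the set of block numbers already claimed elsewhere.
    """
    if count < 0 or count > MAX_FREE_ENTRIES:
        # With a bad count the entries cannot be trusted at all.
        return ["list count out of range"]
    problems = []
    for blk in block_numbers[:count]:
        if blk < first_data_block or blk > last_block:
            problems.append("block %d out of range" % blk)
        elif blk in allocated:
            problems.append("block %d already allocated" % blk)
    return problems


def all_blocks_accounted(free_blocks, claimed_blocks, total_blocks):
    """Final check: free blocks plus claimed blocks equal the total."""
    return free_blocks + claimed_blocks == total_blocks


if __name__ == "__main__":
    print(check_free_list_block(3, [100, 5, 200], 10, 500, {200}))
    print(all_blocks_accounted(120, 380, 500))
```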

5.1.3 Free-Block Count 

The superblock contains a count of the total number of free blocks
within the file system. Fsck compares this count to the number of
blocks it found free within the file system. If the counts do not agree,
then fsck may replace the count in the superblock by the actual
free-block count.

5.1.4 Free-Inode Count 

The superblock contains a count of the total number of free inodes 
within the file system. Fsck compares this count to the number of 
inodes it found free within the file system. If the counts do not agree, 
then fsck may replace the count in the superblock by the actual free- 
inode count. 

5.2 Inodes 

An individual inode is not as likely to be corrupted as the superblock. 
However, because of the great number of active inodes, there is almost 
as likely a chance for corruption in the inode list as in the superblock. 

The list of inodes is checked sequentially starting with inode 1 (there is 
no inode 0) and going to the last inode in the file system. Each inode 
can be checked for inconsistencies involving format and type, link 
count, duplicate blocks, bad blocks, and inode size. 

5.2.1 Format and Type 

Each inode contains a mode word. This mode word describes the type 
and state of the inode. Inodes may be one of four types: 

1. Regular

2. Directory 

3. Special block 

4. Special character. 

If an inode is not one of these types, then the inode has an illegal type.
Inodes may be found in one of three states: unallocated, allocated, and
neither unallocated nor allocated. This last state indicates an incorrectly
formatted inode. An inode can get in this state if bad data is written
into the inode list through, for example, a hardware failure. The only
possible corrective action is for fsck to clear the inode.






5.2.2 Link Count 



Contained in each inode is a count of the total number of directory 
entries linked to the inode. Fsck verifies the link count of each inode 
by traversing down the total directory structure, starting from the root 
directory, and calculating an actual link count for each inode. 

If the stored link count is nonzero and the actual link count is zero, it 
means that no directory entry appears for the inode. If the stored and 
actual link counts are nonzero and unequal, a directory entry may have 
been added or removed without the inode being updated. 

If the stored link count is nonzero and the actual link count is zero,
fsck can, under operator control, link the disconnected file to the
lost+found directory. If the stored and actual link counts are nonzero
and unequal, fsck can replace the stored link count by the actual link
count.
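
The two cases above amount to a small decision rule. The following
Python sketch states it explicitly; the function name and the returned
labels are invented for the illustration, and the sketch makes no claim
about how fsck itself is structured.

```python
def link_count_action(stored, actual):
    """Suggest a corrective action from the stored and actual link counts."""
    if stored == actual:
        return "ok"
    if stored != 0 and actual == 0:
        # No directory entry refers to the inode.
        return "reconnect to lost+found"
    if stored != 0 and actual != 0:
        # A directory entry was added or removed without the inode
        # being updated.
        return "replace stored count with actual count"
    # Stored count of zero with a nonzero actual count: the text above
    # does not describe this case, so the sketch does not guess.
    return "not covered"


if __name__ == "__main__":
    print(link_count_action(1, 0))
    print(link_count_action(2, 3))
    print(link_count_action(2, 2))
```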

5.2.3 Duplicate Blocks 

Contained in each inode is a list or pointers to lists (indirect blocks) of 
all the blocks claimed by the inode. Fsck compares each block number 
claimed by an inode to a list of already allocated blocks. If a block 
number is already claimed by another inode, the block number is added 
to a list of duplicate blocks. Otherwise, the list of allocated blocks is 
updated to include the block number. If there are any duplicate blocks, 
fsck will make a partial second pass of the inode list to find the inode 
of the duplicated block. This is necessary because without examining 
the files associated with these inodes for correct content there is not 
enough information available to decide which inode is corrupted and 
should be cleared. Most of the time, the inode with the earliest modify 
time is incorrect and should be cleared. This condition can occur by 
using a file system with blocks claimed by both the free-block list and 
by other parts of the file system. 

A large number of duplicate blocks in an inode may be due to an 
indirect block not being written to the file system. Fsck will prompt 
the operator to clear both inodes. 
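
The two passes over the inode list can be sketched as follows. This
Python fragment is an illustration under simplifying assumptions: each
inode is reduced to a list of the block numbers it claims (indirect
blocks already expanded), and the function name is hypothetical.

```python
def find_duplicate_blocks(inodes):
    """Map each duplicated block number to the inodes that claim it.

    inodes maps an inode number to the list of block numbers it claims.
    """
    # First pass: build the list of allocated blocks and note any block
    # that is claimed a second time.
    allocated = set()
    duplicates = set()
    for ino in sorted(inodes):
        for blk in inodes[ino]:
            if blk in allocated:
                duplicates.add(blk)
            else:
                allocated.add(blk)

    # Partial second pass: only needed if duplicates were found. It
    # locates every inode that claims a duplicated block, including the
    # one that claimed it first.
    claimants = {}
    if duplicates:
        for ino in sorted(inodes):
            for blk in inodes[ino]:
                if blk in duplicates:
                    claimants.setdefault(blk, []).append(ino)
    return claimants


if __name__ == "__main__":
    print(find_duplicate_blocks({2: [10, 11], 3: [12], 5: [11, 13]}))
```

The second pass is what identifies the first claimant; during the first
pass that inode looked perfectly healthy.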






5.2.4 Bad Blocks 

Contained in each inode is a list or pointer to lists of all the blocks 
claimed by the inode. Fsck checks each block number claimed by an 
inode for a value lower than that of the first data block or greater than 
the last block in the file system. If the block number is outside this 
range, the block number is a bad block number. 

If there is a large number of bad blocks in an inode, this may be due to 
an indirect block not being written to the file system. Fsck will prompt 
the operator to clear both inodes. 
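
The range test for bad block numbers is simple enough to show in a few
lines. The Python sketch below uses made-up names and takes the first
data block and last block as parameters, since their actual values
depend on the file system being checked.

```python
def bad_blocks(claimed, first_data_block, last_block):
    """Return the claimed block numbers that fall outside the file system.

    A block number is bad if it is lower than the first data block or
    greater than the last block in the file system.
    """
    return [blk for blk in claimed
            if blk < first_data_block or blk > last_block]


if __name__ == "__main__":
    # Blocks 3 and 9000 lie outside a file system whose data blocks
    # run from 20 through 4999.
    print(bad_blocks([3, 20, 4999, 9000], 20, 4999))
```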

5.2.5 Size Checks 

Each inode contains a 32-bit (4-byte) size field. This size indicates the 
number of characters in the file associated with the inode. This size 
can be checked for inconsistencies, e.g., directory sizes that are not a 
multiple of 16 characters or the number of blocks actually used not 
matching that indicated by the inode size. 

A directory inode within the file system has the directory bit on in the
inode mode word. The directory size must be a multiple of 16 because 
a directory entry contains 16 bytes (2 bytes for the inode number and 
14 bytes for the file or directory name). 
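
The 16-byte entry layout can be illustrated with a short Python sketch. It is an assumption-laden example rather than a description of any real tool: the names `DIRENT` and `parse_directory` are invented here, and big-endian byte order is assumed because the MC68000 is a big-endian processor.

```python
import struct

# 2-byte inode number followed by a 14-byte name, as described in the text.
# Big-endian byte order is an assumption (the MC68000 is big-endian).
DIRENT = struct.Struct(">H14s")


def parse_directory(data):
    """Split raw directory data into (inode number, name) pairs."""
    if len(data) % DIRENT.size != 0:
        raise ValueError("DIRECTORY MISALIGNED: size is not a multiple of 16")
    entries = []
    for offset in range(0, len(data), DIRENT.size):
        inumber, raw_name = DIRENT.unpack_from(data, offset)
        name = raw_name.rstrip(b"\0").decode("ascii")
        entries.append((inumber, name))
    return entries
```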

Fsck will warn of such directory misalignment. This is only a warning 
because not enough information can be gathered to correct the 
misalignment. 

A rough check of the consistency of the size field of an inode can be 
performed by computing from the size field the number of blocks that 
should be associated with the inode and comparing it to the actual 
number of blocks claimed by the inode. 

Fsck calculates the number of blocks that there should be in an inode 
by dividing the number of characters in an inode by the number of 
characters per block and rounding up. Fsck adds one block for each 
indirect block associated with the inode. If the actual number of blocks 
does not match the computed number of blocks, fsck will warn of a 
possible file-size error. This is only a warning because the system does 
not fill in blocks in files created in random order. 
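
The calculation above amounts to a ceiling division plus the indirect blocks. A minimal sketch, assuming hypothetical function names and a 512-character block purely for the sample values:

```python
def expected_blocks(size_in_chars, chars_per_block, indirect_blocks=0):
    """Blocks an inode should own: data blocks (rounded up) plus indirect blocks."""
    data_blocks = -(-size_in_chars // chars_per_block)  # ceiling division
    return data_blocks + indirect_blocks


def possible_file_size_error(size_in_chars, chars_per_block,
                             actual_blocks, indirect_blocks=0):
    """True when the computed and actual block counts disagree (warning only)."""
    return expected_blocks(size_in_chars, chars_per_block,
                           indirect_blocks) != actual_blocks
```

For example, a 6000-character file with 512-character blocks and one indirect block should own 12 + 1 = 13 blocks. A file written in random order may own fewer, which is why the mismatch is only a warning.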






5.3 Indirect Blocks 

Indirect blocks are owned by an inode. Therefore, inconsistencies in
indirect blocks directly affect the inode that owns them.



Inconsistencies that can be checked are blocks already claimed by 
another inode and block numbers outside the range of the file system. 

For a discussion of detection and correction of the inconsistencies asso- 
ciated with indirect blocks, see parts "Duplicate Blocks" and "Bad 
Blocks". 

5.4 Data Blocks 

The two types of data blocks are plain data blocks and directory data 
blocks. Plain data blocks contain the information stored in a file. 
Directory data blocks contain directory entries. Fsck does not attempt 
to check the validity of the contents of a plain data block. 

Each directory data block can be checked for inconsistencies involving 
directory inode numbers pointing to unallocated inodes, directory inode 
numbers greater than the number of inodes in the file system, incorrect 
directory inode numbers for "." and "..", and directories discon- 
nected from the file system. In addition, the validity of the contents of 
a directory's data block is checked. 

If a directory entry inode number points to an unallocated inode, then 
fsck may remove that directory entry. This condition probably 
occurred because the data blocks containing the directory entries were 
modified and written out while the inode was not yet written out. 

If a directory entry inode number is pointing beyond the end of the 
inode list, fsck may remove that directory entry. This condition occurs 
if bad data is written into a directory data block. 
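
The two inode-number checks just described can be sketched as follows. This is an illustration only: `inode_count` and `allocated` are hypothetical inputs standing in for the tables fsck builds while scanning the inode list.

```python
def classify_entry(inumber, inode_count, allocated):
    """Classify one directory entry by its inode number.

    inode_count is the number of inodes in the file system and allocated
    is a set of allocated inode numbers (both hypothetical inputs).
    """
    if inumber > inode_count:
        return "I OUT OF RANGE"
    if inumber not in allocated:
        return "UNALLOCATED"
    return "OK"
```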

The directory inode number entry for "." should be the first entry in 
the directory data block. Its value should be equal to the inode number 
for the directory data block. 






The directory inode number entry for ".." should be the second entry 
in the directory data block. Its value should be equal to the inode 
number for the parent of the directory entry (or the inode number of 
the directory data block if the directory is the root directory).

If the directory inode numbers are incorrect, fsck may replace them 
with the correct values. 
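
A small sketch of the "." and ".." rules, assuming directory entries are available as (inode number, name) pairs in on-disk order; the function name and the problem strings are invented for the example:

```python
def check_dot_entries(entries, self_inumber, parent_inumber):
    """Return a list of problems with the first two directory entries.

    For the root directory, pass the root's own inode number as
    parent_inumber, since its ".." entry refers to itself.
    """
    problems = []
    if len(entries) < 1 or entries[0] != (self_inumber, "."):
        problems.append('bad "." entry')
    if len(entries) < 2 or entries[1] != (parent_inumber, ".."):
        problems.append('bad ".." entry')
    return problems
```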



Fsck checks the general connectivity of the file system. If a directory
is found not to be linked into the file system, fsck will link the
directory back into the file system in the lost+found directory. This
condition can be caused by inodes being written to the file system with
the corresponding directory data blocks not being written to the file
system.

5.5 Free-List Blocks 

Free-list blocks are owned by the superblock. Therefore, inconsisten- 
cies in free-list blocks directly affect the superblock. 

Inconsistencies that can be checked are a list count outside of range, 
block numbers outside of range, and blocks already associated with the 
file system. 

For a discussion of detection and correction of the inconsistencies asso- 
ciated with free-list blocks, see part "Free-Block List". 

6. FSCK Error Conditions 

6.1 Conventions 

Fsck is a multipass file system check program. Each file system pass
invokes a different phase of the fsck program. After the initial setup,
fsck performs successive phases over each file system performing
cleanup, checking blocks and sizes, pathnames, connectivity, reference
counts, and the free-block list (possibly rebuilding it).

When an inconsistency is detected, fsck reports the error condition to 
the operator. If a response is required, fsck prints a prompt message 
and waits for a response. This section explains the meaning of each
error condition, the possible responses, and the related error conditions. 






The error conditions are organized by the "Phase" of the fsck program 
in which they can occur. The error conditions that may occur in more 
than one phase will be discussed in the next section. 

6.2 Initialization 

Before a file system check can be performed, certain tables have to be 
set up and certain files opened. This section describes the opening of 
files and the initialization of tables. Error conditions resulting from 
command line options, memory requests, opening of files, status of 
files, file system size checks, and creation of the scratch file are listed 
below. 

C option? 

C is not a legal option to fsck; legal options are -y, -n, -s, -S, -t,
-r, -q, and -D. Fsck terminates on this error condition. See the
fsck(1M) entry in the UniPlus+ System V Administrator's Manual for
further details.

Bad -t option

The -t option is not followed by a file name. Fsck terminates on this
error condition. See the fsck(1M) entry in the UniPlus+ System V
Administrator's Manual for further details.

Invalid -s argument, defaults assumed

The -s option is not suffixed by 3, 4, or
blocks-per-cylinder:blocks-to-skip. Fsck assumes a default value of 400
blocks-per-cylinder and 9 blocks-to-skip. See the fsck(1M) entry in the
UniPlus+ System V Administrator's Manual for further details.

Incompatible options: -n and -s

It is not possible to salvage the free-block list without modifying the file
system. Fsck terminates on this error condition. See the fsck(1M)
entry in the UniPlus+ System V Administrator's Manual for further
details.

Can not fstat standard input 

Fsck's attempt to fstat standard input failed. The occurrence of this 
error condition indicates a serious problem which may require addi- 
tional assistance. Fsck terminates on this error condition. 






Can not get memory 

Fsck's request for memory for its virtual memory tables failed. The 
occurrence of this error condition indicates a serious problem which 
may require additional assistance. Fsck terminates on this error condi- 
tion. 

Can not open checkall file: F 

The default file system checkall file F (usually /etc/checkall) cannot be 
opened for reading. Fsck terminates on this error condition. Check 
access modes of F. 

Can not stat root 

Fsck's request for statistics about the root directory "/" failed. The 
occurrence of this error condition indicates a serious problem which 
may require additional assistance. Fsck terminates on this error condi- 
tion. 

Can not stat F 

Fsck's request for statistics about the file system F failed. It ignores 
this file system and continues checking the next file system given. 
Check access modes of F. 

F is not a block or character device 

Fsck has been given a regular file name by mistake. It ignores this file 
system and continues checking the next file system given. Check file 
type of F. 

Can not open F 

The file system F cannot be opened for reading. It ignores this file sys- 
tem and continues checking the next file system given. Check access 
modes of F. 

Size check: fsize X isize Y 

More blocks are used for the inode list Y than there are blocks in the 
file system X, or there are more than 65,535 inodes in the file system. 
It ignores this file system and continues checking the next file system 
given. 






Can not create F 



Fsck's request to create a scratch file F failed. It ignores this file sys- 
tem and continues checking the next file system given. Check access 
modes of F. 

CAN NOT SEEK: BLK B (CONTINUE) 

Fsck's request for moving to a specified block number B in the file sys- 
tem failed. The occurrence of this error condition indicates a serious 
problem which may require additional assistance. 

Possible responses to CONTINUE prompt are: 

YES Attempt to continue to run file system check. Often,
however, the problem will persist. This error condition
will not allow a complete check of the file system. A
second run of fsck should be made to recheck this file
system. If the block was part of the virtual memory buffer
cache, fsck will terminate with the message "Fatal I/O
error".

NO Terminate program. 

CAN NOT READ: BLK B (CONTINUE) 

Fsck's request for reading a specified block number B in the file system 
failed. The occurrence of this error condition indicates a serious prob- 
lem which may require additional assistance. 

Possible responses to CONTINUE prompt are: 

YES Attempt to continue to run file system check. Often,
however, the problem will persist. This error condition
will not allow a complete check of the file system. A
second run of fsck should be made to recheck this file
system. If the block was part of the virtual memory buffer
cache, fsck will terminate with the message "Fatal I/O
error".

NO Terminate program. 






CAN NOT WRITE: BLK B (CONTINUE) 

Fsck's request for writing a specified block number B in the file system 
failed. The disk is write-protected. 



Possible responses to CONTINUE prompt are: 

YES Attempt to continue to run file system check. Often,
however, the problem will persist. This error condition
will not allow a complete check of the file system. A
second run of fsck should be made to recheck this file
system. If the block was part of the virtual memory buffer
cache, fsck will terminate with the message "Fatal I/O
error".

NO Terminate program. 

6.3 PHASE 1: CHECK BLOCKS AND SIZES 

This phase concerns itself with the inode list. This part lists error con- 
ditions resulting from checking inode types, setting up the zero-link- 
count table, examining inode block numbers for bad or duplicate 
blocks, checking inode size, and checking inode format. 

UNKNOWN FILE TYPE I=I (CLEAR)

The mode word of the inode I indicates that the inode is not a special
character inode, regular inode, or directory inode.

Possible responses to CLEAR prompt are:

YES Deallocate inode I by zeroing its contents. This will
always invoke the UNALLOCATED error condition in
Phase 2 for each directory entry pointing to this inode.

NO Ignore this error condition. 

LINK COUNT TABLE OVERFLOW (CONTINUE) 

An internal table for fsck containing allocated inodes with a link count
of zero has no more room. Recompile fsck with a larger value of
MAXLNCNT.

Possible responses to CONTINUE prompt are: 






YES Continue with program. This error condition will not
allow a complete check of the file system. A second run
of fsck should be made to recheck this file system. If
another allocated inode with a zero link count is found,
this error condition is repeated.

NO Terminate program. 

B BAD I=I

Inode I contains block number B with a number lower than the
number of the first data block in the file system or greater than the
number of the last block in the file system. This error condition may
invoke the EXCESSIVE BAD BLKS error condition in Phase 1 if inode
I has too many block numbers outside the file system range. This error
condition will always invoke the BAD/DUP error condition in Phase 2
and Phase 4.

EXCESSIVE BAD BLKS I=I (CONTINUE)

There is more than a tolerable number (usually 10) of blocks with a
number lower than the number of the first data block in the file system
or greater than the number of the last block in the file system
associated with inode I.

Possible responses to CONTINUE prompt are: 

YES Ignore the rest of the blocks in this inode and continue
checking with next inode in the file system. This error
condition will not allow a complete check of the file
system. A second run of fsck should be made to recheck
this file system.

NO Terminate program. 

B DUP I=I

Inode I contains block number B which is already claimed by another
inode. This error condition may invoke the EXCESSIVE DUP BLKS
error condition in Phase 1 if inode I has too many block numbers
claimed by other inodes. This error condition will always invoke Phase
1B and the BAD/DUP error condition in Phase 2 and Phase 4.






EXCESSIVE DUP BLKS I=I (CONTINUE)

There is more than a tolerable number (usually 10) of blocks claimed 
by other inodes. 



Possible responses to CONTINUE prompt are: 

YES Ignore the rest of the blocks in this inode and continue
checking with next inode in the file system. This error
condition will not allow a complete check of the file
system. A second run of fsck should be made to recheck
this file system.

NO Terminate program. 

DUP TABLE OVERFLOW (CONTINUE) 

An internal table in fsck containing duplicate block numbers has no 
more room. Recompile fsck with a larger value of DUPTBLSIZE. 

Possible responses to CONTINUE prompt are: 

YES Continue with program. This error condition will not
allow a complete check of the file system. A second run
of fsck should be made to recheck this file system. If
another duplicate block is found, this error condition will
repeat.

NO Terminate program. 

POSSIBLE FILE SIZE ERROR I=I

The inode I size does not match the actual number of blocks used by
the inode. This is only a warning. If the -q option is used, this
message is not printed.

DIRECTORY MISALIGNED I=I

The size of a directory inode is not a multiple of the size of a directory 
entry (usually 16). This is only a warning. If the -q option is used, 
this message is not printed. 

PARTIALLY ALLOCATED INODE I=I (CLEAR)

Inode I is neither allocated nor unallocated.






Possible responses to CLEAR prompt are: 

YES Deallocate inode I by zeroing its contents. 

NO Ignore this error condition. 

6.4 PHASE 1B: RESCAN FOR MORE DUPS

When a duplicate block is found in the file system, the file system is 
rescanned to find the inode which previously claimed that block. This 
part lists the error condition when the duplicate block is found. 

B DUP I=I

Inode I contains block number B which is already claimed by another
inode. This error condition will always invoke the BAD/DUP error
condition in Phase 2. Inodes with overlapping blocks may be
determined by examining this error condition and the DUP error condition
in Phase 1.

6.5 PHASE 2: CHECK PATHNAMES 

This phase concerns itself with removing directory entries pointing to 
error conditioned inodes from Phase 1 and Phase 1B. This part lists
error conditions resulting from root inode mode and status, directory 
inode pointers in range, and directory entries pointing to bad inodes. 

ROOT INODE UNALLOCATED. TERMINATING 

The root inode (usually inode number 2) has no allocate mode bits.
The occurrence of this error condition indicates a serious problem 
which may require additional assistance. The program will terminate. 

ROOT INODE NOT DIRECTORY (FIX) 

The root inode (usually inode number 2) is not directory inode type. 

Possible responses to FIX prompt are: 

YES Replace the root inode's type to be a directory. If the root
inode's data blocks are not directory blocks, a very large
number of error conditions will be produced.

NO Terminate program. 






DUPS/BAD IN ROOT INODE (CONTINUE) 

Phase 1 or Phase 1B has found duplicate blocks or bad blocks in the
root inode (usually inode number 2) for the file system. 



Possible responses to CONTINUE prompt are: 

YES Ignore DUPS/BAD error condition in root inode and
attempt to continue to run the file system check. If root
inode is not correct, then this may result in a large
number of other error conditions.

NO Terminate program. 

I OUT OF RANGE I=I NAME=F (REMOVE)

A directory entry F has an inode number I which is greater than the
end of the inode list.

Possible responses to REMOVE prompt are: 
YES The directory entry F is removed. 

NO Ignore this error condition. 

UNALLOCATED I=I OWNER=O MODE=M SIZE=S
MTIME=T NAME=F (REMOVE)

A directory entry F has an inode I without allocate mode bits. The
owner O, mode M, size S, modify time T, and file name F are printed.
If the file system is not mounted and the -n option was not specified,
the entry will be removed automatically if the inode it points to is
character size 0.

Possible responses to REMOVE prompt are: 
YES The directory entry F is removed. 

NO Ignore this error condition. 

DUP/BAD I=I OWNER=O MODE=M SIZE=S MTIME=T
DIR=F (REMOVE)

Phase 1 or Phase 1B has found duplicate blocks or bad blocks
associated with directory entry F, directory inode I. The owner O, mode M,
size S, modify time T, and directory name F are printed.




Possible responses to REMOVE prompt are: 
YES The directory entry F is removed. 

NO Ignore this error condition. 

DUP/BAD I=I OWNER=O MODE=M SIZE=S MTIME=T
FILE=F (REMOVE)

Phase 1 or Phase 1B has found duplicate blocks or bad blocks
associated with directory entry F, inode I. The owner O, mode M, size S,
modify time T, and file name F are printed.

Possible responses to REMOVE prompt are: 
YES The directory entry F is removed. 

NO Ignore this error condition. 

BAD BLK B IN DIR I=I OWNER=O MODE=M SIZE=S
MTIME=T

This message only occurs when the -q option is used. A bad block
was found in DIR inode I. Error conditions looked for in directory
blocks are nonzero padded entries, inconsistent "." and ".." entries,
and imbedded slashes in the name field. This error message indicates
that the user should at a later time either remove the directory inode if
the entire block looks bad or change (or remove) those directory
entries that look bad.

6.6 PHASE 3: CHECK CONNECTIVITY 

This phase concerns itself with the directory connectivity seen in Phase 
2. This part lists error conditions resulting from unreferenced direc- 
tories and missing or full lost+found directories. 

UNREF DIR I=I OWNER=O MODE=M SIZE=S MTIME=T
(RECONNECT)

The directory inode I was not connected to a directory entry when the
file system was traversed. The owner O, mode M, size S, and modify
time T of directory inode I are printed. Fsck will force the
reconnection of a nonempty directory.






Possible responses to RECONNECT prompt are: 

YES Reconnect directory inode I to the file system in directory
for lost files (usually lost+found). This may invoke
lost+found error condition in Phase 3 if there are
problems connecting directory inode I to lost+found. This
may also invoke CONNECTED error condition in Phase 3
if link was successful.

NO Ignore this error condition. This will always invoke
UNREF error condition in Phase 4.

SORRY. NO lost+found DIRECTORY

There is no lost+found directory in the root directory of the file system;
fsck ignores the request to link a directory in lost+found. This will
always invoke the UNREF error condition in Phase 4. Check access
modes of lost+found. See fsck(1M) in the System V Administrator's
Manual for further details.

SORRY. NO SPACE IN lost+found DIRECTORY

There is no space to add another entry to the lost+found directory in
the root directory of the file system; fsck ignores the request to link a
directory in lost+found. This will always invoke the UNREF error
condition in Phase 4. Clean out unnecessary entries in lost+found or make
lost+found larger. See fsck(1M) in the System V Administrator's Manual
for further details.

DIR I=I1 CONNECTED. PARENT WAS I=I2

This is an advisory message indicating a directory inode I1 was
successfully connected to the lost+found directory. The parent inode I2 of the
directory inode I1 is replaced by the inode number of the lost+found
directory.

6.7 PHASE 4: CHECK REFERENCE COUNTS 

This phase concerns itself with the link count information seen in 
Phase 2 and Phase 3. This part lists error conditions resulting from 
unreferenced files; missing or full lost+found directory; incorrect link 
counts for files, directories, or special files; unreferenced files and 
directories; bad and duplicate blocks in files and directories; and 
incorrect total free-inode counts. 






UNREF FILE I=I OWNER=O MODE=M SIZE=S MTIME=T
(RECONNECT)

Inode I was not connected to a directory entry when the file system was
traversed. The owner O, mode M, size S, and modify time T of inode
I are printed. If the -n option is not set and the file system is not
mounted, empty files will not be reconnected and will be cleared
automatically.

Possible responses to RECONNECT prompt are: 

YES Reconnect inode I to file system in the directory for lost
files (usually lost+found). This may invoke lost+found
error condition in Phase 4 if there are problems
connecting inode I to lost+found.

NO Ignore this error condition. This will always invoke
CLEAR error condition in Phase 4.

SORRY. NO lost+found DIRECTORY

There is no lost+found directory in the root directory of the file system;
fsck ignores the request to link a file in lost+found. This will always
invoke CLEAR error condition in Phase 4. Check access modes of
lost+found.

SORRY. NO SPACE IN lost+found DIRECTORY

There is no space to add another entry to the lost+found directory in 
the root directory of the file system; fsck ignores the request to link a 
file in lost+found. This will always invoke the CLEAR error condition 
in Phase 4. Check size and contents of lost+found. 

(CLEAR) 

The inode mentioned in the immediately previous error condition
cannot be reconnected.

Possible responses to CLEAR prompt are:

YES Deallocate inode mentioned in the immediately previous
error condition by zeroing its contents.

NO Ignore this error condition. 




LINK COUNT FILE I=I OWNER=O MODE=M SIZE=S
MTIME=T COUNT=X SHOULD BE Y (ADJUST)

The link count for inode I, which is a file, is X but should be Y. The
owner O, mode M, size S, and modify time T are printed.

Possible responses to ADJUST prompt are:

YES Replace link count of file inode I with Y.

NO Ignore this error condition. 

LINK COUNT DIR I=I OWNER=O MODE=M SIZE=S
MTIME=T COUNT=X SHOULD BE Y (ADJUST)

The link count for inode I, which is a directory, is X but should be Y.
The owner O, mode M, size S, and modify time T of directory inode I
are printed.

Possible responses to ADJUST prompt are:

YES Replace link count of directory inode I with Y.

NO Ignore this error condition. 

LINK COUNT F I=I OWNER=O MODE=M SIZE=S
MTIME=T COUNT=X SHOULD BE Y (ADJUST)

The link count for F inode I is X but should be Y. The file name F,
owner O, mode M, size S, and modify time T are printed.

Possible responses to ADJUST prompt are:

YES Replace link count of inode I with Y.

NO Ignore this error condition. 

UNREF FILE I=I OWNER=O MODE=M SIZE=S MTIME=T
(CLEAR)

Inode I, which is a file, was not connected to a directory entry when
the file system was traversed. The owner O, mode M, size S, and
modify time T of inode I are printed. If the -n option is not set and
the file system is not mounted, empty files will be cleared
automatically.






Possible responses to CLEAR prompt are: 

YES Deallocate inode I by zeroing its contents.

NO Ignore this error condition. 

UNREF DIR I=I OWNER=O MODE=M SIZE=S MTIME=T
(CLEAR)

Inode I, which is a directory, was not connected to a directory entry
when the file system was traversed. The owner O, mode M, size S,
and modify time T of inode I are printed. If the -n option is not set
and the file system is not mounted, empty directories will be cleared
automatically. Nonempty directories will not be cleared.

Possible responses to CLEAR prompt are:

YES Deallocate inode I by zeroing its contents.

NO Ignore this error condition. 

BAD/DUP FILE I=I OWNER=O MODE=M SIZE=S
MTIME=T (CLEAR)

Phase 1 or Phase 1B has found duplicate blocks or bad blocks
associated with file inode I. The owner O, mode M, size S, and modify time
T of inode I are printed.

Possible responses to CLEAR prompt are:

YES Deallocate inode I by zeroing its contents.

NO Ignore this error condition. 

BAD/DUP DIR I=I OWNER=O MODE=M SIZE=S MTIME=T
(CLEAR)

Phase 1 or Phase 1B has found duplicate blocks or bad blocks
associated with directory inode I. The owner O, mode M, size S, and
modify time T of inode I are printed.

Possible responses to CLEAR prompt are:

YES Deallocate inode I by zeroing its contents.






NO Ignore this error condition. 

FREE INODE COUNT WRONG IN SUPERBLK (FIX) 

The actual count of the free inodes does not match the count in the 
superblock of the file system. If the -q option is specified, the count
will be fixed automatically in the superblock. 



Possible responses to FIX prompt are: 

YES Replace count in superblock by actual count. 

NO Ignore this error condition. 

6.8 PHASE 5: CHECK FREE LIST

This phase concerns itself with the free-block list. This part lists error 
conditions resulting from bad blocks in the free-block list, bad free- 
blocks count, duplicate blocks in the free-block list, unused blocks from 
the file system not in the free-block list, and the total free-block count 
incorrect. 

EXCESSIVE BAD BLKS IN FREE LIST (CONTINUE) 

The free-block list contains more than a tolerable number (usually 10) 
of blocks with a value less than the first data block in the file system or 
greater than the last block in the file system. 

Possible responses to CONTINUE prompt are: 

YES Ignore rest of the free-block list and continue execution of
fsck. This error condition will always invoke "BAD
BLKS IN FREE LIST" error condition in Phase 5.

NO Terminate program. 

EXCESSIVE DUP BLKS IN FREE LIST (CONTINUE) 

The free-block list contains more than a tolerable number (usually 10) 
of blocks claimed by inodes or earlier parts of the free-block list. 

Possible responses to CONTINUE prompt are: 

YES Ignore the rest of the free-block list and continue
execution of fsck. This error condition will always invoke
"DUP BLKS IN FREE LIST" error condition in Phase 5.

NO Terminate program.

BAD FREEBLK COUNT 

The count of free blocks in a free-list block is greater than 50 or less
than 0. This error condition will always invoke the "BAD FREE 
LIST" condition in Phase 5. 

X BAD BLKS IN FREE LIST 

X blocks in the free-block list have a block number lower than the first 
data block in the file system or greater than the last block in the file 
system. This error condition will always invoke the "BAD FREE 
LIST" condition in Phase 5. 

X DUP BLKS IN FREE LIST 

X blocks claimed by inodes or earlier parts of the free-list block were 
found in the free-block list. This error condition will always invoke the 
"BAD FREE LIST" condition in Phase 5. 

X BLK(S) MISSING 

X blocks unused by the file system were not found in the free-block 
list. This error condition will always invoke the "BAD FREE LIST" 
condition in Phase 5. 
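
The three counts reported in Phase 5 (bad, duplicate, and missing blocks) can be sketched as follows. This is a simplified illustration that treats the free-block list as a flat sequence of block numbers; the function name, argument names, and returned dictionary are assumptions, and the real free list is a chain of free-list blocks.

```python
def check_free_list(free_list, claimed, first_data_block, last_block):
    """Count bad, duplicate, and missing blocks for a flattened free list.

    claimed is the set of block numbers already claimed by inodes.
    """
    seen = set(claimed)
    bad = dup = 0
    for block in free_list:
        if block < first_data_block or block > last_block:
            bad += 1            # outside the file system range
        elif block in seen:
            dup += 1            # claimed by an inode or seen earlier in the list
        else:
            seen.add(block)
    # Blocks in range that are neither claimed nor on the free list.
    missing = sum(1 for b in range(first_data_block, last_block + 1)
                  if b not in seen)
    return {"bad": bad, "dup": dup, "missing": missing}
```

Any nonzero count corresponds to the "BAD FREE LIST" condition, which offers to salvage the list.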

FREE BLK COUNT WRONG IN SUPERBLOCK (FIX) 

The actual count of free blocks does not match the count in the super- 
block of the file system. 

Possible responses to FIX prompt are: 

YES Replace count in superblock by actual count. 

NO Ignore this error condition. 

BAD FREE LIST (SALVAGE) 

Phase 5 has found bad blocks in the free-block list, duplicate blocks in
the free-block list, or blocks missing from the file system. If the -q
option is specified, the free-block list will be salvaged automatically.






Possible responses to SALVAGE prompt are: 

YES Replace actual free-block list with a new free-block list.
The new free-block list will be ordered to reduce the time
spent waiting for the disk to rotate into position.

NO Ignore this error condition. 

6.9 PHASE 6: SALVAGE FREE LIST 

This phase concerns itself with the free-block list reconstruction. This 
part lists error conditions resulting from the blocks-to-skip and blocks- 
per-cylinder values. 

Default free-block list spacing assumed 

This is an advisory message indicating the blocks-to-skip is greater than 
the blocks-per-cylinder, the blocks-to-skip is less than 1, the blocks- 
per-cylinder is less than 1, or the blocks-per-cylinder is greater than 
500. The default values of 9 blocks-to-skip and 400 blocks-per-cylinder 
are used. See fsck(1M) in the System V Administrator's Manual for
further details. 
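
The validity rules listed above can be written out as a short sketch. The constant and function names are hypothetical; the limits (1, 500) and the defaults (400 and 9) are the ones quoted in the text.

```python
DEFAULT_BLOCKS_PER_CYLINDER = 400
DEFAULT_BLOCKS_TO_SKIP = 9
MAX_BLOCKS_PER_CYLINDER = 500


def free_list_spacing(blocks_per_cylinder, blocks_to_skip):
    """Return (blocks_per_cylinder, blocks_to_skip, defaults_assumed)."""
    invalid = (
        blocks_to_skip > blocks_per_cylinder
        or blocks_to_skip < 1
        or blocks_per_cylinder < 1
        or blocks_per_cylinder > MAX_BLOCKS_PER_CYLINDER
    )
    if invalid:
        return DEFAULT_BLOCKS_PER_CYLINDER, DEFAULT_BLOCKS_TO_SKIP, True
    return blocks_per_cylinder, blocks_to_skip, False
```

The third element of the result marks the cases in which the advisory message would apply.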

6.10 CLEANUP 

Once a file system has been checked, a few cleanup functions are per- 
formed. This part lists advisory messages about the file system and 
modify status of the file system. 

X files Y blocks Z free 

This is an advisory message indicating that the file system checked con- 
tained X files using Y blocks leaving Z blocks free in the file system. 
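
An administrator who logs fsck output may want to pull the three numbers out of this summary line. The following is a sketch only: the helper name is invented, and the exact spacing of the real message is an assumption based on the form shown above.

```python
import re

# Pattern follows the message form "X files Y blocks Z free" shown above.
SUMMARY = re.compile(r"^\s*(\d+) files (\d+) blocks (\d+) free\s*$")


def parse_summary(line):
    """Return the counts from a summary line, or None if it does not match."""
    match = SUMMARY.match(line)
    if match is None:
        return None
    files, blocks, free = (int(group) for group in match.groups())
    return {"files": files, "blocks": blocks, "free": free}
```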

***** BOOT UNIX (NO SYNC!) ***** 

This is an advisory message indicating that a mounted file system or the 
root file system has been modified by fsck. If the UniPlus+ system is
not rebooted immediately without sync, the work done by fsck may be
undone by the in-core copies of tables the UniPlus+ system keeps.

***** FILE SYSTEM WAS MODIFIED ***** 

This is an advisory message indicating that the current file system was 
modified by fsck. 






Chapter 9: LP SPOOLING SYSTEM 



CONTENTS 



1. General
2. Overview of LP Features
2.1 Definitions
2.2 Commands
2.2.1 Commands for General Use
2.3 Commands for LP Administrators
3. Building LP
4. Configuring LP - The "lpadmin" Command
4.1 Introducing New Destinations
4.2 Modifying Existing Destinations
4.3 Specifying the System Default Destination
4.4 Removing Destinations
5. Making an Output Request - The "lp" Command
6. Finding LP Status - "lpstat"
7. Canceling Requests - "cancel"
8. Allowing and Refusing Requests - Accept and Reject
9. Allowing and Inhibiting Printing - Enable and Disable
10. Moving Requests Between Destinations - "lpmove"
11. Stopping and Starting the Scheduler - "lpshut" and "lpsched"
12. Printer Interface Programs
13. Setting Up Hard-Wired Devices and Login Terminals as LP Printers
13.1 Hard-wired Devices
13.2 Login Terminals
14. Summary



Chapter 9 
LP SPOOLING SYSTEM 



1. General 

The line printer (LP) program is a series of commands that perform 
diverse spooling functions under UniPlus+. Since the primary LP
application is off-line printing, this document focuses mainly on spool- 
ing to line printers. LP allows administrators to spool to a collection of 
line printers of any type and to group printers into logical classes to 
maximize the throughput of the devices. Users can: 

• Queue and cancel print requests. 

• Prevent and allow queuing to devices. 

• Start and stop LP from processing requests. 

• Change printer configuration. 

• Find status of the LP system. 

This chapter describes the role of an LP administrator. 

2. Overview of LP Features 

2.1 Definitions 

We define several terms before presenting a brief summary of LP
commands. LP was designed to meet the needs of users on different
UniPlus+ systems. Changes to the LP configuration are performed by
the lpadmin(1M) command.

LP makes a distinction between printers and printing devices. A device
is a physical peripheral device or a file and is represented by a full
UniPlus+ system pathname. A printer is a logical name that represents
a device. At different times, a printer may be associated with different
devices. A class is a name given to an ordered list of printers. Every
class must contain at least one printer. Each printer may be a member
of zero or more classes. A destination is a printer or a class. One
destination may be designated as the system default destination. The
lp(1) command directs all output to this destination unless the user
specifies otherwise. Output that is routed to a printer will be printed
only by that printer, whereas output directed to a class will be printed
by the first available class member.
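The routing rule for classes can be illustrated with a short shell sketch. It is not part of LP: the function name pick_printer, the printer names p1 and p2, and the "ready" and "busy" labels are assumptions made for illustration only. The function walks the ordered member list and returns the first member that is ready.

```shell
# pick_printer is a hypothetical helper, not an LP command.
# Each argument is a "name:state" pair, given in class order;
# the states "ready" and "busy" are labels assumed for this sketch.
pick_printer() {
    for member in "$@"; do
        name=${member%%:*}          # text before the first colon
        state=${member#*:}          # text after the first colon
        if [ "$state" = ready ]; then
            echo "$name"
            return 0
        fi
    done
    return 1                        # no member is ready; the request must wait
}

pick_printer p1:ready p2:ready      # prints p1
pick_printer p1:busy p2:ready       # prints p2
```

Because the list is ordered, the first member is preferred whenever more than one member is ready.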



Each invocation of lp creates an output request that consists of the files
to be printed and options from the lp command line. An interface
program which formats requests must be supplied for each printer. The
LP scheduler, lpsched(1M), services requests for all destinations by
routing requests to interface programs to do the printing on devices.
An LP configuration for a system consists of devices, destinations, and
interface programs.

2.2 Commands 

2.2.1 Commands for General Use 

The lp(1) command is used to request printing files. It creates an
output request and returns a request id of the form

dest-seqno 

to the user, where seqno is a unique sequence number across the entire 
LP system and dest is the destination where the request was routed. 
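As an illustration of the request id format, the following shell sketch splits an id into its destination and sequence number. The helper name split_request_id is hypothetical and is not an LP command; the sketch assumes only the dest-seqno form described above.

```shell
# split_request_id is a hypothetical helper, not an LP command.
# It prints "dest seqno" for an id of the form dest-seqno.
split_request_id() {
    id=$1
    case $id in
        *-*) ;;                     # must contain a hyphen
        *)   return 1 ;;
    esac
    dest=${id%-*}                   # everything before the last hyphen
    seqno=${id##*-}                 # everything after the last hyphen
    case $seqno in
        ''|*[!0-9]*) return 1 ;;    # sequence number must be all digits
    esac
    [ -n "$dest" ] || return 1
    echo "$dest $seqno"
}

split_request_id xyz-52             # prints: xyz 52
```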

Cancel cancels output requests. The user supplies either request ids
(as returned by lp) or printer names; when printer names are given, the
requests currently printing on those printers are canceled.

Disable prevents lpsched from routing output requests to printers.

Enable(1) allows lpsched to route output requests to printers.

2.3 Commands for LP Administrators 

Each LP system must designate a person or persons as LP administrator
to perform the restricted functions listed below. Either the superuser
or any user who is logged into the UniPlus+ system as lp qualifies as an
LP administrator. All LP files and commands are owned by lp except
for lpadmin and lpsched which are owned by root. The following
commands are described in more detail later in this chapter.

lpadmin(1M)   Modifies LP configuration. Many features of this
              command cannot be used when lpsched is running.

lpsched(1M)   Routes output requests to interface programs which
              do the printing on devices.

lpshut        Stops lpsched from running. All printing activity is
              halted, but other LP commands may still be used.

accept(1M)    Allows lp to accept output requests for destinations.

reject        Prevents lp from accepting requests for destinations.

lpmove        Moves output requests from one destination to
              another. Whole destinations may be moved at one
              time. This command cannot be used when lpsched
              is running.

3. Building LP 

All LP commands are built from source code that resides in the
/usr/src/cmd/lp directory, including the make file, lp.mk. Unless some of
the definitions in lp.mk are changed, LP may be installed only by the
superuser. Before installing a new LP system, make sure there is a
login called "lp" on your system and that the spool directory,
/usr/spool/lp, does not exist. To install LP, do the following:

    cd /usr/src/cmd/lp
    make -f lp.mk install

This builds all LP commands and creates an initial LP configuration
consisting of no printers, classes, or default destination. LP must be
configured by an LP administrator using the lpadmin command to
create a useful spooler.

In addition, add the following code to /etc/rc:

    rm -f /usr/spool/lp/SCHEDLOCK
    /usr/lib/lpsched
    echo "LP scheduler started"

This starts the LP scheduler each time that UniPlus+ is restarted.

Several variables in lp.mk may be changed before installing LP to
customize the system:

Variable   Default Value    Meaning

SPOOL      /usr/spool/lp    spool directory
ADMIN      lp               logname of LP Administrator
GROUP      bin              group owning LP commands/data
ADMDIR     /usr/lib         commands of administrator
USRDIR     /usr/bin         user commands reside here



If an existing LP spool directory is corrupted (but not the LP programs)
or if it needs to be rebuilt from scratch, make sure that lpsched is not
running and do the following as superuser:

1. Make copies of any interface programs that are not standard LP
software. DO NOT make these copies underneath the spool
directory. The pathname of the interface program for printer "p" is
/usr/spool/lp/interface/p.

2. rm -fr /usr/spool/lp

3. make -f lp.mk new (This recreates the bare LP configuration
described above.)

PRECAUTIONS 

1. Some LP commands invoke other LP commands. Moving them 
after they are built will cause some commands to fail. 

2. The files under the SPOOL directory should be modified only by 
LP commands. 

3. All LP commands require set-user-id permission. If this is 
removed, the commands will fail. 

4. Configuring LP: The "lpadmin" Command

Changes to the LP configuration should be made by using the lpadmin
command and not by hand. Lpadmin will not attempt to alter the LP
configuration when lpsched is running, except where explicitly noted
below.

4.1 Introducing New Destinations 

The following information must be supplied to lpadmin when introducing
a new printer:

1. The printer name (-pprinter) is an arbitrary name which must
conform to the following rules:

   • It must be no longer than 14 characters.

   • It must consist solely of alphanumeric characters and
     underscores.

   • It must not be the name of an existing LP destination
     (printer or class).

2. The device associated with the printer (-vdevice). This is the
pathname of a hard-wired printer, a login terminal, or other file
that is writable by lp.

3. The printer interface program. This may be specified in one of
three ways:

   • It may be selected from a list of model interfaces supplied
     with LP (-mmodel).

   • It may be the same interface that an existing printer uses
     (-eprinter).

   • It may be a program supplied by the LP administrator
     (-iinterface).



Information which need not always be supplied when creating a new
printer includes:

1. The user may specify -h to indicate that the device for the
printer is hardwired or the device is the name of a file (this is
assumed by default). If, on the other hand, the device is the
pathname of a login terminal, then -l must be included on the
command line. This indicates to lpsched that it must automatically
disable this printer each time lpsched starts running. This fact is
reported by lpstat when it indicates printer status:

    $ lpstat -pa
    printer a (login terminal) disabled Oct 31 11:15 -
         disabled by scheduler: login terminal

This is done because device names for login terminals can be (and
usually are) associated with different physical devices from day to
day. If the scheduler did not take this action, somebody might
log in and be surprised that LP is spooling to his/her terminal!

2. The new printer may be added to an existing class or added to a
new class (-cclass). New class names must conform to the same
rules for new printer names.
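The first two naming rules for printers and classes can be checked mechanically. The following is a minimal shell sketch; valid_lp_name is a hypothetical helper and not part of LP. It does not check the third rule (that the name is not already in use), since that requires looking at the existing LP configuration.

```shell
# valid_lp_name is a hypothetical helper, not an LP command.
# It succeeds if the name is 1 to 14 characters long and consists
# solely of alphanumeric characters and underscores.
valid_lp_name() {
    name=$1
    case $name in
        ''|*[!A-Za-z0-9_]*) return 1 ;;   # empty, or has a disallowed character
    esac
    [ "${#name}" -le 14 ]                 # no longer than 14 characters
}

valid_lp_name pr1 && echo "pr1: acceptable"
valid_lp_name laser-printer || echo "laser-printer: rejected"
```

The second name is rejected because it contains a hyphen, which is neither alphanumeric nor an underscore.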



EXAMPLES

The following examples will be referenced by further examples in later
sections.

1. Create a printer called pr1 whose device is /dev/printer and whose
interface program is the model hp interface:

    $ /usr/lib/lpadmin -ppr1 -v/dev/printer -mhp

2. Add a printer called pr2 whose device is /dev/tty22 and whose
interface is a variation of the model prx interface. It is also a
login terminal:

    $ cp /usr/spool/lp/model/prx xxx
    < edit xxx >
    $ /usr/lib/lpadmin -ppr2 -v/dev/tty22 -ixxx -l

3. Create a printer called pr3 whose device is /dev/tty23. Printer pr3
will be added to a new class called cl1 and will use the same interface
as printer pr2:

    $ /usr/lib/lpadmin -ppr3 -v/dev/tty23 -epr2 -ccl1

4.2 Modifying Existing Destinations 

Modifications to existing destinations must always be made with respect
to a printer name (-pprinter). The modifications may be one or more
of the following:

1. The device for the printer may be changed (-vdevice). If this is
the only modification, then this may be done even while lpsched is
running. This facilitates changing devices for login terminals.

2. The printer interface program may be changed (-mmodel,
-eprinter, -iinterface).

3. The printer may be specified as hardwired (-h) or as a login
terminal (-l).

4. The printer may be added to a new or existing class (-cclass).

5. The printer may be removed from an existing class (-rclass).
Removing the last remaining member of a class causes the class
to be deleted. No destination may be removed if it has pending
requests. In that case, lpmove or cancel should be used to move
or delete the pending requests.



EXAMPLES

These examples are based on the LP configuration created by those in
the previous section.

1. Add printer pr2 to class cl1:

    $ /usr/lib/lpadmin -ppr2 -ccl1

2. Change pr2's interface program to the model prx interface,
change its device to /dev/tty24, and add it to a new class called cl2:

    $ /usr/lib/lpadmin -ppr2 -mprx -v/dev/tty24 -ccl2

Note that printers pr2 and pr3 now use different interface
programs even though pr3 was originally created with the same
interface as pr2. Printer pr2 is now a member of two classes.

3. Specify printer pr2 as a hard-wired printer:

    $ /usr/lib/lpadmin -ppr2 -h

4. Add printer pr1 to class cl2:

    $ /usr/lib/lpadmin -ppr1 -ccl2

The members of class cl2 are now pr2 and pr1, in that order.
Requests routed to class cl2 will be serviced by pr2 if both pr2
and pr1 are ready to print; otherwise, they will be printed by the
one which is next ready to print.

5. Remove printers pr2 and pr3 from class cl1:

    $ /usr/lib/lpadmin -ppr2 -rcl1
    $ /usr/lib/lpadmin -ppr3 -rcl1

Since pr3 was the last remaining member of class cl1, the class is
removed.

6. Add pr3 to a new class called cl3:

    $ /usr/lib/lpadmin -ppr3 -ccl3

4.3 Specifying the System Default Destination 

The system default destination may be changed even when lpsched is
running.

EXAMPLES

1. Establish class cl1 as the system default destination:

    $ /usr/lib/lpadmin -dcl1

2. Establish no default destination:

    $ /usr/lib/lpadmin -d

4.4 Removing Destinations 

Classes and printers may be removed only if there are no pending
requests that were routed to them. Pending requests must either be
canceled using cancel or moved to other destinations using lpmove
before destinations may be removed. If the removed destination is the
system default destination, then the system will have no default
destination until the default destination is respecified. When the last
remaining member of a class is removed, then the class is also
removed. Removing a class never implies removing printers.

EXAMPLES

1. Make printer pr1 the system default destination:

    $ /usr/lib/lpadmin -dpr1

Remove printer pr1:

    $ /usr/lib/lpadmin -xpr1

Now there is no system default destination.

2. Remove printer pr2:

    $ /usr/lib/lpadmin -xpr2

Class cl2 is also removed since pr2 was its only member.

3. Remove class cl3:

    $ /usr/lib/lpadmin -xcl3

Class cl3 is removed, but printer pr3 remains.

5. Making an Output Request: The "lp" Command

Once LP destinations have been created, users may request output by
using the lp command. The request id that is returned may be used to
see if the request has been printed or to cancel the request.

The LP program determines the destination of a request by checking
the following list in order:

• If the user specifies -ddest on the command line, then the
request is routed to dest.

• If the environment variable LPDEST is set, the request is routed 
to the value of LPDEST. 

• If there is a system default destination, then the request is routed 
there. 

• The request is rejected. 
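The order of checks in the list above can be sketched in shell as follows. The function resolve_dest is a hypothetical illustration and not part of lp; it takes the -d value and the system default as arguments (either may be empty) and reads LPDEST from the environment.

```shell
# resolve_dest is a hypothetical helper, not part of lp.
#   $1 = destination given with -d on the command line (may be empty)
#   $2 = system default destination (may be empty)
resolve_dest() {
    if [ -n "$1" ]; then
        echo "$1"                   # -ddest takes precedence
    elif [ -n "${LPDEST:-}" ]; then
        echo "$LPDEST"              # then the LPDEST environment variable
    elif [ -n "$2" ]; then
        echo "$2"                   # then the system default destination
    else
        echo "request rejected: no destination" >&2
        return 1
    fi
}

LPDEST=zoo
resolve_dest ""  cl1                # prints zoo
resolve_dest xyz cl1                # prints xyz
```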

EXAMPLES 

1. There are at least four ways to print the password file on the
system default destination:

    lp /etc/passwd
    lp < /etc/passwd
    cat /etc/passwd | lp
    lp -c /etc/passwd

The last three ways print copies of the file, whereas the first way
prints the file directly. Thus, with the first way, if the file is
modified between the time the request is made and the time it is
actually printed, the changes will be reflected in the output.

2. Print two copies of file abc on printer xyz and title the output
"my file":

    pr abc | lp -dxyz -n2 -t"my file"

3. Print file xxx on a Diablo* 1640 printer called zoo in 12-pitch and
write to the user's terminal when printing has completed:

    lp -dzoo -o12 -w xxx

In this example, "12" is an option that is meaningful to the
model Diablo 1640 interface program that prints output in 12-pitch
mode [see lpadmin(1M)].



* Registered trademark of Xerox Corporation 




6. Finding LP Status: "lpstat"

The lpstat command finds status information about LP requests,
destinations, and the scheduler.

EXAMPLES

1. List the status of all pending output requests made by this user:

    lpstat

The status information for a request includes the request id, the
logname of the user, the total number of characters to be printed,
and the date and time the request was made.

2. List the status of printers p1 and p2:

    lpstat -pp1,p2

7. Canceling Requests: "cancel"

You can cancel LP requests with the cancel command. Two kinds of
arguments may be given to the command: request ids and printer
names. The requests named by the request ids are canceled and
requests that are currently printing on the named printers are canceled.
Both types of arguments may be intermixed.

EXAMPLE

Cancel the request that is now printing on printer xyz:

    cancel xyz

If the user that is canceling a request is not the same one that made the 
request, then mail is sent to the owner of the request. LP allows any 
user to cancel requests in order to eliminate the need for users to find 
LP administrators when unusual output should be purged from printers. 

8. Allowing and Refusing Requests: Accept and Reject

When a new destination is created, lp rejects requests that are routed to
it. When the LP administrator is sure that it is set up correctly, he or
she should allow lp to accept requests for that destination. The accept
command performs this function.

Sometimes it is necessary to prevent lp from routing requests to
destinations. If printers have been removed or are waiting to be repaired or
if too many requests are building for printers, then you may want to
have lp reject requests for those destinations. The reject command
performs this function. After the condition that led to the rejection of
requests has been remedied, the accept command should be used to
allow requests to be taken again.

The acceptance status of destinations is reported by the -a option of
lpstat.

EXAMPLES

1. Cause lp to reject requests for destination xyz:

    /usr/lib/reject -r"printer xyz needs repair" xyz

Any users that try to route requests to xyz will encounter the
following:

    $ lp -dxyz file
    lp: can not accept requests for destination "xyz"
        - printer xyz needs repair

2. Allow lp to accept requests routed to destination xyz:

    /usr/lib/accept xyz

9. Allowing and Inhibiting Printing: Enable and Disable

The enable command allows the LP scheduler to print requests on
printers. That is, the scheduler routes requests only to the interface
programs of enabled printers. Note that it is possible to enable a
printer and at the same time prevent further requests from being
routed to it.

The disable command will undo the effects of the enable command. It
prevents the scheduler from routing requests to printers, independently
of whether lp is allowing them to accept requests. Printers may be
disabled for several reasons including malfunctioning hardware, paper
jams, and end of day shutdowns. If a printer is busy at the time it is
disabled, then the request that was printing will be reprinted in its
entirety either on another printer (if the request was originally routed
to a class of printers) or on the same one when the printer is
re-enabled. The -c option cancels the currently printing requests on busy
printers in addition to disabling the printers. This is useful if strange
output is causing a printer to behave abnormally.

EXAMPLE

Disable printer xyz because of a paper jam:

    $ disable -r"paper jam" xyz
    printer "xyz" now disabled

Find the status of printer xyz:

    $ lpstat -pxyz
    printer "xyz" disabled since Jan 5 10:15 -
         paper jam

Now, re-enable xyz:

    $ enable xyz
    printer "xyz" now enabled

10. Moving Requests Between Destinations: "lpmove"

Occasionally, it is useful for LP administrators to move output requests
between destinations. For instance, when a printer is down for repairs,
it may be desirable to move all of its pending requests to a working
printer. This is one way to use the lpmove command. The other use
of this command is moving specific requests to a different destination.
Lpmove will refuse to move requests while the LP scheduler is
running.

EXAMPLES

1. Move all requests for printer abc to printer xyz:

    $ /usr/lib/lpmove abc xyz

All of the moved requests are renamed from abc-nnn to xyz-nnn.
As a side effect, destination abc is no longer accepting further
requests.

2. Move requests zoo-543 and abc-1200 to printer xyz:

    $ /usr/lib/lpmove zoo-543 abc-1200 xyz

The two requests are now renamed xyz-543 and xyz-1200.
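The renaming shown in these examples keeps the sequence number and replaces only the destination part of the request id. A small shell sketch of that rule follows; moved_request_id is a hypothetical helper used only for illustration and is not part of lpmove.

```shell
# moved_request_id is a hypothetical helper, not part of lpmove.
#   $1 = new destination, $2 = old request id (dest-seqno)
moved_request_id() {
    newdest=$1
    oldid=$2
    echo "${newdest}-${oldid##*-}"  # keep the sequence number, swap the destination
}

moved_request_id xyz zoo-543        # prints xyz-543
moved_request_id xyz abc-1200       # prints xyz-1200
```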



11. Stopping and Starting the Scheduler: "lpshut" and "lpsched"

Lpsched is the program that routes the output requests (made with lp)
through the appropriate printer interface programs to be printed on line
printers. Each time the scheduler routes a request to an interface
program, it records an entry in the log file, /usr/spool/lp/log. This entry
contains the logname of the user that made the request, the request id, the
name of the printer that the request is being printed on, and the date
and time that printing first started. If a request has been restarted,
more than one entry in the log file may refer to the request. The
scheduler also records error messages in the log file. When lpsched is
started, it renames /usr/spool/lp/log to /usr/spool/lp/oldlog and starts a new
log file.

No printing will be performed by the LP system unless lpsched is
running. Use the command

    lpstat -r

to find the status of the LP scheduler.

Lpsched is normally started by the /etc/rc program, as described above,
and continues to run until the UniPlus+ system is shut down. The
scheduler operates in the /usr/spool/lp directory. When it starts running,
it will exit immediately if a file called SCHEDLOCK exists. Otherwise,
it creates this file to prevent more than one scheduler from running at
the same time.
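The lock file convention can be illustrated with the following shell sketch. It is not how lpsched itself is implemented (this guide does not show that); acquire_schedlock is a hypothetical function, and the sketch uses a scratch directory instead of /usr/spool/lp so that it can be run safely. It relies on the shell's noclobber option so that testing for the file and creating it happen in one step.

```shell
# acquire_schedlock is a hypothetical helper; it only illustrates the
# SCHEDLOCK convention. $1 is the directory that holds the lock file.
acquire_schedlock() {
    lock=$1/SCHEDLOCK
    # With noclobber (set -C), the redirection fails if the file
    # already exists, so the check and the creation are one operation.
    if ( set -C; : > "$lock" ) 2>/dev/null; then
        return 0                    # lock created; this scheduler may run
    fi
    return 1                        # SCHEDLOCK exists; exit immediately
}

dir=${TMPDIR:-/tmp}/lpdemo.$$       # scratch directory for the demonstration
mkdir "$dir"
acquire_schedlock "$dir" && echo "first scheduler: running"
acquire_schedlock "$dir" || echo "second scheduler: SCHEDLOCK exists, exiting"
rm -rf "$dir"
```

A lock file left behind by a scheduler that exited abnormally has the same effect as a running scheduler, which is why /etc/rc removes SCHEDLOCK before starting lpsched.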



Occasionally, it is necessary to shut down the scheduler to reconfigure
LP or to rebuild the LP software. The command

    /usr/lib/lpshut

causes lpsched to stop running and terminates all printing. All requests
that were in the middle of printing will be reprinted in their entirety
when the scheduler is restarted.

To restart the LP scheduler, use the command

    /usr/lib/lpsched

Shortly after this command is entered, lpstat should report that the
scheduler is running. If not, it is possible that a previous invocation of
lpsched exited without removing SCHEDLOCK, so try the following:

    rm -f /usr/spool/lp/SCHEDLOCK
    /usr/lib/lpsched

The scheduler should be running now.

12. Printer Interface Programs

Every LP printer must have an interface program which does the actual
printing on the device that is currently associated with the printer.
Interface programs may be shell procedures, C programs, or any other
executable program. The LP model interfaces are all written as shell
procedures and can be found in the /usr/spool/lp/model directory. At the
time lpsched routes an output request to a printer P, the interface
program for P is invoked in the directory /usr/spool/lp as follows:

    interface/P id user title copies options file ...

where

    id        is the request id returned by lp
    user      is the logname of the user who made the request
    title     is an optional title specified by the user
    copies    is the number of copies requested by the user
    options   is a blank-separated list of class or
              printer-dependent options specified by the user
    file      is the full pathname of a file to be printed

EXAMPLES

The following examples are requests made by user "smith" with a
system default destination of printer "xyz". Each example lists an lp
command line followed by the corresponding command line generated
for printer xyz's interface program:

1.  lp /etc/passwd /etc/group
    interface/xyz xyz-52 smith "" 1 "" /etc/passwd /etc/group

2.  pr /etc/passwd | lp -t"users" -n5
    interface/xyz xyz-53 smith users 5 "" /usr/spool/lp/request/xyz/d0-53

3.  lp /etc/passwd -oa -ob
    interface/xyz xyz-54 smith "" 1 "a b" /etc/passwd




When the interface program is invoked, its standard input comes from 
/dev/null and both the standard output and standard error output are 
directed to the printer's device. Devices are opened for reading as well 
as writing when file modes permit. When a device is a regular file, all 
output is appended to the end of the file. 

Given the command line arguments and the output directed to a dev- 
ice, interface programs may format their output in any way they choose. 
Interface programs must ensure that the proper stty modes (terminal 
characteristics such as baud rate, output options, etc.) are in effect on 
the output device. This may be done in a shell interface only if the 
device is opened for reading: 

stty mode ... <&1 

That is, take the standard input for the stty command from the device. 

When printing has completed, it is the responsibility of the interface
program to exit with a code indicative of the success of the print job.
Exit codes are interpreted by lpsched as follows:

CODE               MEANING TO LPSCHED

0                  The print job has completed successfully.

1 to 127           A problem was encountered in printing this
                   particular request (e.g., too many nonprintable
                   characters). This problem will not affect future
                   print jobs. Lpsched notifies users by mail that
                   there was an error in printing the request.

greater than 127   These codes are reserved for internal use by
                   lpsched. Interface programs must not exit with
                   codes in this range.
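The three ranges in the table can be summarized in a short shell sketch. The function classify_exit and its output strings are hypothetical and serve only to restate the table; they are not taken from lpsched.

```shell
# classify_exit is a hypothetical helper that restates the table above.
#   $1 = exit code of an interface program
classify_exit() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "success"
    elif [ "$code" -le 127 ]; then
        echo "request error: user is notified by mail"
    else
        echo "reserved for lpsched"
    fi
}

classify_exit 0                     # prints: success
classify_exit 3                     # prints: request error: user is notified by mail
classify_exit 130                   # prints: reserved for lpsched
```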

When problems that are likely to affect future print jobs occur (e.g., a 
device filter program is missing), the interface programs would be wise 
to disable printers so that print requests are not lost. When a busy 
printer is disabled, the interface program will be terminated with signal 
15. 
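To tie the argument list and the exit codes together, the following is a minimal sketch of what an interface program might look like, written as a shell function so that it can be tried without an LP installation. It is not one of the model interfaces: the name sample_interface and the banner layout are assumptions, and a real interface would also set stty modes on the device and allow for termination by signal 15, neither of which is shown here.

```shell
# sample_interface is a hypothetical stand-in for an interface program
# such as /usr/spool/lp/interface/P. Its standard output plays the role
# of the printer's device.
sample_interface() {
    id=$1 user=$2 title=$3 copies=$4 options=$5
    shift 5                              # remaining arguments are the files

    # Banner layout is an assumption made for this sketch.
    echo "Request: $id  User: $user  Title: ${title:-(none)}  Options: ${options:-(none)}"

    n=1
    while [ "$n" -le "$copies" ]; do
        for file in "$@"; do
            cat "$file" || return 1      # 1 to 127: problem with this request only
        done
        n=$((n + 1))
    done
    return 0                             # 0: print job completed successfully
}

f=${TMPDIR:-/tmp}/lpdemo.$$              # scratch file for the demonstration
echo "hello" > "$f"
sample_interface xyz-52 smith "" 2 "" "$f"
rm -f "$f"
```

The demonstration prints the banner line followed by two copies of the scratch file.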



13. Setting Up Hard-Wired Devices and Login Terminals as
LP Printers

13.1 Hard-wired Devices

As an example of how to set up a hard-wired device for use as an LP
printer, consider using tty line 15 as printer xyz. As superuser,
perform the following:

1. Avoid unwanted output from non-LP processes and ensure that
LP can write to the device:

    $ chown lp /dev/tty15
    $ chmod 600 /dev/tty15

2. Change /etc/inittab so that tty15 is not a login terminal. In other
words, ensure that /etc/getty is not trying to log users in at this
terminal. Change the entries for tty15 to:

    15:2:off:/etc/getty -t60 tty15 1200

Enter the command:

    $ telinit Q

If there is currently an invocation of /etc/getty running on tty15,
kill it. When the UniPlus+ system is rebooted, tty15 will be
initialized with default stty modes. Thus, it is up to LP interface
programs to establish the proper baud rate and other stty modes
for correct printing to occur.

3. Introduce printer xyz to LP using the model prx interface
program:

    $ /usr/lib/lpadmin -pxyz -v/dev/tty15 -mprx

4. When xyz is created, it will initially be disabled and lp will be
rejecting requests routed to it. If it is desired, allow lp to accept
requests for xyz:

    /usr/lib/accept xyz

This will allow requests to build up for xyz and to print when it is
enabled at a later time.

5. When it is desired for printing to occur, be sure that the printer is
ready to receive output. For several printers, this means that the
top of form has been adjusted and that the printer is on-line.
Enable printing to occur on xyz:


    enable xyz

When requests have been routed to xyz, they will begin printing.

13.2 Login Terminals

Login terminals may also be used as LP printers. To do this for a
Diablo 1640 terminal called abc, perform the following:

1. Introduce printer abc to LP using the model 1640 interface
program:

    $ /usr/lib/lpadmin -pabc -v/dev/null -m1640 -l

Note that /dev/null is used as abc's device because we will specify
the actual device each time that abc is enabled. This device may
be different from day to day. When abc is created, it will initially
be disabled, and lp will be rejecting requests routed to it. If it is
desired, allow lp to accept requests for abc:

    /usr/lib/accept abc

This will allow requests to build up for abc and to be printed
when it is enabled at a later time. It is not advisable to enable abc
for printing, however, until the following steps have been taken.

2. Log terminal in if this has not already been done.

3. Assuming the tty(1) command reports that this terminal is
/dev/tty02, associate this device with printer abc:

    $ /usr/lib/lpadmin -pabc -v/dev/tty02

Note that lpadmin may be used only by an LP administrator. If it
is desired for other users to routinely perform this step, then an
LP administrator may establish a program owned by lp or by root
with set-user-id permission that performs this function.

4. When it is desired for printing to occur, be sure that the printer is
ready to receive output. For several printers, this means that the
top of form has been adjusted. Enable printing to occur on abc:

    enable abc

When requests have been routed to abc, they will begin printing.

5. When all printing has stopped on abc or when you want it back as 
a regular login terminal, you may prevent it from printing more 
output: 



    $ disable abc
    printer "abc" now disabled

If abc is enabled when UniPlus+ is rebooted or when lpsched is
restarted, it will be disabled automatically.

14. Summary 

The administrative functions of the LP administrator have been 
described in detail. These functions include configuring and 
reconfiguring LP; maintaining printer interface programs; accepting, 
rejecting, and moving print requests; stopping and starting the LP 
scheduler; and enabling and disabling printers. LP offers administrators 
the following advantages over other centrally supported printer pack- 
ages: 

• Printers may be grouped into classes. 

• LP may be configured to meet the needs of each site. 

• Administrators may supply interface programs to format output in 
any way desirable. 

• LP functions are performed by simple commands and not by 
hand. 






Chapter 10: SYSTEM ACTIVITY PACKAGE 

CONTENTS 

1. General
2. System Activity Counters
3. System Activity Commands
   3.1 The "sar" Command
   3.2 The "sag" Command
   3.3 The "timex" Command
   3.4 The "sadp" Command
4. Daily Report Generation
   4.1 Facilities
   4.2 Suggested Operational Setup
5. File Descriptions
6. The "sysinfo" Structure
7. Reporting Items
   7.1 CPU Utilization
   7.2 Cache Hit Ratio
   7.3 Disk or Tape I/O Activity
   7.4 Queue Activity
   7.5 The Rest of System Activity



Chapter 10 
SYSTEM ACTIVITY PACKAGE 



1. General 

This chapter describes the design and implementation of the UniPlus+
System Activity Package. UniPlus+ contains several counters that are
incremented as system actions occur. The system activity package
reports UniPlus+ system-wide measurements, including central
processing unit (CPU) utilization, disk and tape input/output (I/O)
activities, terminal device activity, buffer usage, system calls, system
switching and swapping, file-access activity, queue activity, and message
and semaphore activities.

The package has four commands that generate various types of reports.
Procedures that automatically generate daily reports are also included.
The five functions of the activity package are:

• sar(1) command: allows a user to generate system activity
reports in real-time and to save system activities in a file for later
use.

• sag(1G) command: displays system activity in a graphical form.

• sadp(1) command: samples disk activity once every second
during a specified time interval and reports disk usage and seek
distance in either tabular or histogram form.

• timex(1): a modified time(1) command that times a command
and also (optionally) reports concurrent system activity and
process accounting activity.

• system activity daily reports: provides procedures for sampling
and saving system activities in a data file periodically and for
generating the daily report from the data file.

The system activity information reported by this package is derived
from a set of system counters located in the operating system kernel.
These system counters are described in the section "System Activity
Counters." The section "System Activity Commands" describes the
commands provided by this package. The procedure for generating
daily reports is given in "Daily Report Generation." For a description
of the files used by the system activity package, see the section "File
Descriptions."

2. System Activity Counters 

UniPlus+ manages several counters that record various activities and
provide the basis for the system activity reporting system. The data
structure for most of these counters is defined in the sysinfo structure in
/usr/include/sys/sysinfo.h. The system table overflow counters are kept in
the _syserr structure. The device activity counters are extracted from
the device status tables. In this version, the I/O activity of the
following devices is recorded: RP06, RM05, RS04, RF11, RK05, RP03,
RL02, TM03, and TM11.

The following paragraphs describe the system activity counters sampled 
by the system activity package. 

Cpu time counters: There are four time counters that may be
incremented at each clock interrupt, which occurs 60 times per second.
According to the mode the CPU is in at the interrupt (idle, user,
kernel, or wait for I/O completion), one of the cpu[] counters is
incremented.
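Since exactly one of the four counters is incremented at each clock interrupt, the share of time spent in each mode over an interval is that counter's increase divided by the total increase. The following shell sketch shows the arithmetic; cpu_percent is a hypothetical helper, the sample values are invented for illustration, and integer division truncates the results.

```shell
# cpu_percent is a hypothetical helper, not part of the activity package.
# Arguments are the increases in the idle, user, kernel, and wait
# counters between two samples.
cpu_percent() {
    t_idle=$1 t_user=$2 t_kernel=$3 t_wait=$4
    total=$((t_idle + t_user + t_kernel + t_wait))
    [ "$total" -gt 0 ] || return 1
    echo "idle=$((100 * t_idle / total))% user=$((100 * t_user / total))%" \
         "kernel=$((100 * t_kernel / total))% wait=$((100 * t_wait / total))%"
}

# One minute at 60 interrupts per second is 3600 ticks (invented sample).
cpu_percent 1800 900 720 180
# prints: idle=50% user=25% kernel=20% wait=5%
```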

Lread and lwrite: The lread and lwrite counters count logical read and
write requests issued by the system to block devices.

Bread and bwrite: The bread and bwrite counters count the number of
times data is transferred between the system buffers and the block
devices. These actual I/Os are triggered by logical I/Os that cannot be
satisfied by the current contents of the buffers. The ratio of block I/O
to logical I/O is a common measure of the effectiveness of the system
buffering.
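One common way to express this ratio is as a read hit percentage, (1 - bread/lread) x 100: the fraction of logical reads that were satisfied from the buffers without a block transfer. The sketch below assumes that definition; cache_hit_percent is a hypothetical helper and the counter values are invented for illustration.

```shell
# cache_hit_percent is a hypothetical helper, not part of the package.
#   $1 = increase in bread, $2 = increase in lread over the same interval
cache_hit_percent() {
    awk -v b="$1" -v l="$2" 'BEGIN {
        if (l <= 0) exit 1                    # no logical reads: ratio undefined
        printf "%.1f\n", (1 - b / l) * 100    # percentage of reads served from buffers
    }'
}

cache_hit_percent 250 1000      # prints 75.0
```

The same calculation applies to writes, using bwrite and lwrite.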

Phread and phwrite— The phread and phwrite counters count read and
write requests issued by the system to raw devices. 

Swapin and swapout— The swapin and swapout counters are incre- 
mented for each system request initiating a transfer from or to the swap 
device. More than one request is usually involved in bringing a process 
into or out of memory because text and data are handled separately.
Frequently used programs are kept on the swap device and are swapped
in rather than loaded from the file system. The swapin counter reflects
these initial loading operations as well as resumptions of activity, while
the swapout counter reveals the level of actual "swapping." The
amount of data transferred between the swap device and memory is
measured in blocks and counted by bswapin and bswapout.

Pswitch and syscall— These counters are related to the management of 
multiprogramming. Syscall is incremented every time a system call is 
invoked. The numbers of invocations of read(2), write(2), fork(2),
and exec(2) system calls are kept in counters sysread, syswrite, sysfork,
and sysexec, respectively. Pswitch counts the times the switcher was 
invoked, which occurs when: 

1. A system call resulted in a road block 

2. An interrupt occurred resulting in awakening a higher priority 
process 

3. A 1 second clock interrupt occurred. 

Iget, namei, and dirblk— These counters apply to file-access operations. 
Iget and namei, in particular, are the names of UniPlus+ routines. The
counters record the number of times the respective routines are called. 
Namei is the routine that performs file system path searches. It 
searches the various directory files to get the associated i-number of a 
file corresponding to a specified path. Iget is a routine called to locate the
inode entry of a file (i-number). It first searches the in-core inode 
table. If the inode entry is not in the table, routine iget will get the 
inode from the file system where the file resides and make an entry in 
the in-core inode table for the file. Iget returns a pointer to this entry. 
Namei calls iget, but other file access routines also call iget. Therefore,
counter iget is always greater than counter namei. 

Counter dirblk records the number of directory block reads issued by 
the system. The directory blocks read divided by the number of namei 
calls estimates the average path length of files. 

Runque, runocc, swpque, and swpocc— These counters record queue 
activities. They are implemented in the clock.c routine. At every
one-second interval, the clock routine examines the process table to see
whether any processes are in core and in ready state. If so, the counter
runocc is incremented and the number of such processes is added to
counter runque. While examining the process table, the clock routine
also checks whether any processes in the swap device are in ready state. 
The counter swpocc is incremented if the swap queue is occupied, and 
the number of processes in swap queue is added to counter swpque. 

Readch and writech— The readch and writech counters record the total 
number of bytes (characters) transferred by the read and write system 
calls, respectively. 

Monitoring terminal device activities— There are six counters monitor- 
ing terminal device activities. Rcvint, xmtint, and mdmint are counters 
measuring hardware interrupt occurrences for receiver, transmitter, and 
modem individually. Rawch, canch, and outch count the number of char-
acters in the raw queue, canonical queue, and output queue. Charac- 
ters generated by devices operating in the cooked mode, such as termi- 
nals, are counted in both rawch and (as edited) in canch; but characters 
from raw devices, such as communication processors, are counted only 
in rawch. 

Msg and sema counters— These counters record message sending and 
receiving activities and semaphore operations, respectively. 

Monitoring I/O activities— As to the I/O activity for a disk or tape 
device, four counters are kept for each disk or tape drive in the device 
status table. Counter io_ops is incremented when an I/O operation has 
occurred on the device. It includes block I/O, swap I/O, and physical 
I/O. Io_bcnt counts the amount of data transferred between the device
and memory in 512-byte units. Io_act and io_resp measure the active
time and response time of a device in time ticks summed over all I/O 
requests that have completed for each device. The device active time 
includes the device seeking, rotating, and data transferring times, while 
the response time of an I/O operation is from the time the I/O request 
is queued to the device to the time when the I/O completes. 

Inodeovf, fileovf, textovf, and procovf— These counters are extracted 
from the _syserr structure. When an overflow occurs in any of the inode,
file, text, and process tables, the corresponding overflow counter is 
incremented. 






3. System Activity Commands 

The system activity package provides three commands for generating
various system activity reports and one command for profiling disk
activities. These tools facilitate observation of system activity during:

• A controlled stand-alone test of a large system. 

• An uncontrolled run of a program to observe the operating 
environment. 

• Normal production operation. 



Commands sar and sag permit the user to specify a sampling interval 
and number of intervals for examining system activity and then to 
display the observed level of activity in tabular or graphical form. The 
timex command reports the amount of system activity that occurred 
during the precise period of execution of a timed command. The sadp 
command allows the user to establish a sampling period during which 
access location and seek distance on specified disks are recorded and 
later displayed as a tabular summary or as a histogram. 

3.1 The "sar" Command 

The sar command can be used in the following two ways: 

• When the frequency arguments t and n are specified, it invokes 
the data collection program sadc to sample the system activity 
counters in the operating system every t seconds for n intervals 
and generates system activity reports in real-time. Generally, you 
will want to include the option to save the sampled data in a file 
for later examination. The format of the data file is shown in 
sar(1M). In addition to the system counters, a time stamp is also
included. It gives the time at which the sample was taken. 



• If no frequency arguments are supplied, it generates system
activity reports for a specified time interval from an existing data
file that was created by sar at an earlier time.



A convenient use is to run sar as a background process saving its sam- 
ples in a temporary file but sending its standard output to /dev/null. 
Then an experiment is conducted after which the system activity is 
extracted from the temporary file. The sar(1) manual entry describes
the usage and lists various types of reports. See the section "Reporting 
Items," which gives the formula for deriving each reported item. 






3.2 The "sag" Command 

Sag displays system activity data graphically. It relies on the data file 
produced by a prior run of sar after which any column of data or the 
combination of columns of data of the sar report can be plotted. A 
fairly simple but powerful command syntax allows the specification of 
cross plots or time plots. Data items are selected using the sar column 
header names. The sag(1G) manual entry describes its options and
usage. The system activity graphical program invokes graphics(1G)
and tplot(1G) commands to have the graphical output displayed on any
of the terminal types supported by tplot. 

3.3 The "timex" Command 

The timex command is an extension of the time(1) command. Without
options, timex behaves like time. In addition to giving the time infor- 
mation, it can also print a system activity report and a process account- 
ing report. For all the options available, refer to the manual entry 
timex(1). It should be emphasized that the user and sys times reported
in the second and third lines are for the measured process itself includ- 
ing all its children while the remaining data (including the "cpu user 
%" and "cpu sys %") are for the entire system. 

While the normal use of timex will probably be to measure a single 
command, multiple commands can also be timed either by combining
them in an executable file and timing it or by typing:

timex sh -c "cmd1; cmd2; ... ;"

This establishes the necessary parent-child relationships to correctly
extract the user and system times consumed by cmd1, cmd2, ... (and
the shell).

3.4 The "sadp" Command 

Sadp is a user level program that can be invoked independently by any 
user. It requires no storage or extra code in the operating system and 
allows the user to specify the disks to be monitored. The program is 
reawakened every second, reads system tables from /dev/kmem, and 
extracts the required information. Because of the 1 second sampling, 
only a small fraction of disk requests are observed; however, compara- 
tive studies have shown that the statistical determination of disk locality 
is adequate when sufficient samples are collected. 




In the operating system, there is an iobuf for each disk drive. It
contains two pointers, the head and tail of the I/O active queue for
the device. The actual requests in the queue may be found in three
buffer header pools: system buffer headers for block I/O requests,
physical buffer headers for physical I/O requests, and swap buffer
headers for swap I/O. Each buffer header has a forward pointer that
points to the next request in the I/O active queue and a backward
pointer that points to the previous request.

Sadp snapshots the iobuf of the monitored device and the three buffer 
header pools once every second during the monitoring period. It then 
traces the requests in the I/O queue and records the disk access location
and seek distance in buckets of 8-cylinder increments. At the end of
the monitoring period, it prints out the sampled data. The output of sadp
can be used to balance load among disk drives and to rearrange the
layout of a particular disk pack. This command is described in manual
entry sadp(1).
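
The bucketing step described above can be sketched in a few lines of Python. This is only an illustration, not the sadp source: the function name seek_histogram, the input format (a list of cylinder numbers in the order the requests were observed), and the definition of seek distance as the absolute difference between successive cylinders are assumptions made for the example.

```python
from collections import Counter

BUCKET_CYLINDERS = 8  # bucket width taken from the text above


def seek_histogram(cylinders):
    """Tally access locations and seek distances in 8-cylinder buckets.

    cylinders is a hypothetical list of cylinder numbers, one per
    observed request. Each bucket is keyed by its lowest cylinder.
    """
    access = Counter()
    seek = Counter()
    previous = None
    for cylinder in cylinders:
        access[cylinder // BUCKET_CYLINDERS * BUCKET_CYLINDERS] += 1
        if previous is not None:
            # Assumed definition: distance between successive requests.
            distance = abs(cylinder - previous)
            seek[distance // BUCKET_CYLINDERS * BUCKET_CYLINDERS] += 1
        previous = cylinder
    return access, seek
```

A disk whose access counts cluster in a few buckets, or whose seek counts sit mostly in the lowest bucket, shows good locality; widely spread seek distances suggest that rearranging the pack may help.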

4. Daily Report Generation 

The previous part described the commands available to users to initiate 
activity observations. It is probably desirable for each installation to 
routinely monitor and record system activity in a standard way for his- 
torical analysis. This part describes the steps that a system administra- 
tor may follow to automatically produce a standard daily report of sys- 
tem activity. 

4.1 Facilities 

• sadc— The executable module of sadc.c (see "File Descriptions") 
which reads system counters from /dev/kmem and records them to 
a file. In addition to the file argument, two frequency arguments 
are usually specified to indicate the sampling interval and number 
of samples to be taken. In case no frequency arguments are 
given, it writes a dummy record in the file to indicate a system 
restart.

• sa1— The shell procedure that invokes sadc to write system
counters in the daily data file /usr/adm/sa/sadd where dd represents
the day of the month. It may be invoked with sampling interval 
and iterations as arguments. 

• sa2— The shell procedure that invokes the sar command to
generate daily report /usr/adm/sa/sardd from the daily data file
/usr/adm/sa/sadd. It also removes daily data files and report files
after 7 days. The starting and ending times and all report options 
of sar are applicable to sa2. 
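
As an illustration of the naming scheme, the following Python sketch derives the daily data file and report file names for a given date. The helper name daily_files is hypothetical; only the /usr/adm/sa directory and the two-digit day-of-month suffix come from this guide.

```python
import datetime

SA_DIR = "/usr/adm/sa"


def daily_files(day):
    """Return (data_file, report_file) for a datetime.date.

    The suffix is the zero-padded day of the month, the same value
    that `date +%d` prints.
    """
    dd = day.strftime("%d")
    return f"{SA_DIR}/sa{dd}", f"{SA_DIR}/sar{dd}"
```

Because the names depend only on the day of the month, the same names come around again a month later, so files that are not removed in the meantime would collide with the new ones.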

4.2 Suggested Operational Setup 

It is suggested that cron(1M) control the normal data collection and
report generation operations. For example, the sample entries in
/usr/spool/cron/crontab/sys:

0 * * * 0,6 /usr/lib/sa/sa1

0 18-7 * * 1-5 /usr/lib/sa/sa1

0 8-17 * * 1-5 /usr/lib/sa/sa1 1200 3

would cause the data collection program sadc to be invoked every hour
on the hour. Moreover, depending on the arguments presented, it
writes data to the data file one to three times per invocation, at
20-minute intervals. Therefore, under the control of cron(1M), the data
file is written every 20 minutes between 8:00 and 18:00 on weekdays and
hourly at other times.
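
The combined effect of the three sample entries can be tabulated with a short Python sketch. The function name sample_minutes is hypothetical, and the sketch assumes that sa1 without arguments records a single sample and that the first of the three 1200-second samples is taken at the moment of invocation; neither detail is stated in this guide.

```python
def sample_minutes(weekday, hour):
    """Return the minutes past the hour at which a record is written.

    weekday follows the cron convention: 0 is Sunday, 6 is Saturday.
    """
    prime_time = 1 <= weekday <= 5 and 8 <= hour <= 17
    if prime_time:
        interval_seconds, samples = 1200, 3   # "sa1 1200 3"
    else:
        interval_seconds, samples = 0, 1      # plain "sa1" (assumed: one sample)
    return [n * interval_seconds // 60 for n in range(samples)]
```

Under these assumptions a weekday yields 44 records (ten prime-time hours with three samples each, plus fourteen hourly samples) and a weekend day yields 24.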

Note that data samples are taken more frequently during prime time on 
weekdays to make them available for a finer and more detailed
graphical display. It is suggested that sa1 be invoked hourly rather than
invoking it once every day; this ensures that if the system crashes data
collection will be resumed within an hour after the system is restarted.

Because system activity counters restart from zero when the system is 
restarted, a special record is written on the data file to reflect this situa- 
tion. This process is accomplished by invoking sadc with no frequency 
arguments within /etc/rc when going to multiuser state: 

su adm -c "/usr/lib/sa/sadc /usr/adm/sa/sa`date +%d`"

Cron(1M) also controls the invocation of sar to generate the daily
report via shell procedure sa2. One may choose the time period the
daily report is to cover and the groups of system activity to be reported.
For instance, if:

0 20 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:00 -i 3600 -uybd

is an entry in /usr/spool/cron/crontab/sys, cron will execute the sar
command to generate daily reports from the daily data file at 20:00 on
weekdays. The daily report covers the CPU utilization, terminal device
activity, buffer usage, and device activity every hour from 8:00 to
18:00.




In case of a shortage of the disk space or for any other reason, these 
data files and report files can be removed by the superuser. The 
manual entry sar(1M) describes the daily report generation procedure.

5. File Descriptions 

The source files and shell programs of the system activity package are 
in directory /usr/src/cmd/sa. 

sa.h              The system activity header file defines the structure
                  of the data file and device information for measured
                  devices. It is included in sadc.c, sar.c, and timex.c.

sadc.c            The data collection program that accesses /dev/kmem
                  to read the system activity counters and writes data
                  either on standard output or on a binary data file.
                  It is invoked by the sar command generating a
                  real-time report. It is also invoked indirectly by
                  entries in /usr/spool/cron/crontab/sys to collect
                  system activity data.

sar.c             The report generation program invokes sadc to examine
                  system activity data, generates reports in real-time,
                  and saves the data to a file for later use. It may
                  also generate system activity reports from an
                  existing data file. It is invoked indirectly by cron
                  to generate daily reports.

saghdr.h          The header file for saga.c and sagb.c. It contains
                  data structures and variables used by saga.c and
                  sagb.c.

saga.c & sagb.c   The graph generation program that first invokes sar
                  to format the data of a data file in a tabular form
                  and then displays the sar data in graphical form.

sa1.sh            The shell procedure that invokes sadc to write data
                  file records. It is activated by entries in
                  /usr/spool/cron/crontab/sys.

sa2.sh            The shell procedure that invokes sar to generate the
                  report. It also removes the daily data files and
                  daily report files after a week. It is activated by
                  an entry in /usr/spool/cron/crontab/sys on weekdays.

timex.c           The program that times a command and generates a
                  system activity or process accounting report.

sadp.c            The program that samples and reports disk activities.






6. The "sysinfo" Structure



struct sysinfo
{
        time_t  cpu[4];
#define CPU_IDLE        0
#define CPU_USER        1
#define CPU_KERNAL      2
#define CPU_WAIT        3
        time_t  wait[3];
#define W_IO            0
#define W_SWAP          1
#define W_PIO           2
        long    bread;
        long    bwrite;
        long    lread;
        long    lwrite;
        long    phread;
        long    phwrite;
        long    swapin;
        long    swapout;
        long    bswapin;
        long    bswapout;
        long    pswitch;
        long    syscall;
        long    sysread;
        long    syswrite;
        long    sysfork;
        long    sysexec;
        long    runque;
        long    runocc;
        long    swpque;
        long    swpocc;
        long    iget;
        long    namei;
        long    dirblk;
        long    readch;
        long    writech;
        long    rcvint;
        long    xmtint;
        long    mdmint;
        long    rawch;
        long    canch;
        long    outch;
        long    msg;
        long    sema;
};




7. Reporting Items 

The derivation of the reported items is given in this section. Each item
discussed below is the difference between the values sampled at two
distinct times, t1 and t2.

7.1 CPU Utilization 

%-of-cpu-x = cpu-x / (cpu-idle + cpu-user + cpu-kernel + cpu-wait) * 100
where cpu-x is cpu-idle, cpu-user, cpu-kernel (cpu-sys), or cpu-wait.
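
The CPU utilization formula can be checked with a small Python sketch. The function name cpu_utilization and the tuple layout (idle, user, kernel, wait, matching the order of the cpu[4] array) are choices made for this example, not part of the sar implementation.

```python
def cpu_utilization(sample_t1, sample_t2):
    """Return the percentage of time spent in each CPU mode.

    Each sample holds four tick counts in the order
    idle, user, kernel, wait.
    """
    modes = ("idle", "user", "kernel", "wait")
    deltas = [later - earlier for earlier, later in zip(sample_t1, sample_t2)]
    total = sum(deltas)
    if total == 0:
        # No clock ticks elapsed between the samples; nothing to report.
        return {mode: 0.0 for mode in modes}
    return {mode: delta / total * 100 for mode, delta in zip(modes, deltas)}
```

For samples (100, 50, 30, 20) and (160, 80, 50, 30) the differences are 60, 30, 20, and 10 ticks, so the CPU was idle for half of the interval and in user mode for a quarter of it.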

7.2 Cache Hit Ratio 

%-of-cache-I/O = (logical-I/O - block-I/O) / logical-I/O * 100
where cache I/O is cache read or cache write.
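
A minimal Python sketch of the cache hit ratio follows. The function name cache_hit_percent is hypothetical, and returning zero when no logical I/O occurred is an assumption of the sketch rather than documented sar behavior.

```python
def cache_hit_percent(logical_io, block_io):
    """Percentage of logical requests satisfied from the system buffers.

    Both arguments are differences between two samples, for example
    the change in lread and bread for the read cache.
    """
    if logical_io == 0:
        return 0.0  # assumed: report zero when there was no logical I/O
    return (logical_io - block_io) / logical_io * 100
```

With 1000 logical reads and 250 block reads in an interval, 750 reads were served from the buffers, giving a read cache hit ratio of 75 percent.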

7.3 Disk or Tape I/O Activity

%-of-busy = I/O-active / (t2 - t1) * 100;
avg-queue-length = I/O-resp / I/O-active;
avg-wait = (I/O-resp - I/O-active) / I/O-ops;
avg-service-time = I/O-active / I/O-ops.
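
The four device formulas can be evaluated together, as in the following Python sketch. The function name device_activity and the dictionary keys are illustrative, and treating an idle device as all zeros is an assumption. All arguments are differences between two samples, with the times expressed in clock ticks.

```python
def device_activity(io_ops, io_act, io_resp, elapsed_ticks):
    """Derive the disk or tape activity items from counter differences."""
    if io_ops == 0 or io_act == 0 or elapsed_ticks == 0:
        # Assumed handling of an idle device or a zero-length interval.
        return {"busy_percent": 0.0, "avg_queue_length": 0.0,
                "avg_wait": 0.0, "avg_service_time": 0.0}
    return {
        "busy_percent": io_act / elapsed_ticks * 100,
        "avg_queue_length": io_resp / io_act,
        "avg_wait": (io_resp - io_act) / io_ops,
        "avg_service_time": io_act / io_ops,
    }
```

For 200 operations with 1200 active ticks and 3000 response ticks over a 4800-tick interval, the device was busy 25 percent of the time, each request spent 6 ticks in service on average, and it waited 9 ticks before being serviced.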

7.4 Queue Activity 

avg-x-queue-length = x-queue / x-queue-occupied-time; 
%-of-x-queue-occupied-time = x-queue-occupied-time / (t2 - t1);

where x-queue is run queue or swap queue. 
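
The queue formulas are sketched below in Python. The function name queue_activity is hypothetical. The sketch multiplies the occupancy by 100 to express it as a percentage, which is an assumption; the formula above gives it as a plain ratio.

```python
def queue_activity(queue_sum, occupied_seconds, elapsed_seconds):
    """Return (average queue length, percent of time the queue was occupied).

    queue_sum is the change in runque (or swpque) and occupied_seconds
    the change in runocc (or swpocc) between two samples.
    """
    if occupied_seconds == 0 or elapsed_seconds == 0:
        return 0.0, 0.0  # the queue was never found occupied
    average_length = queue_sum / occupied_seconds
    occupied_percent = occupied_seconds / elapsed_seconds * 100  # assumed scaling
    return average_length, occupied_percent
```

Note that the average is taken only over the seconds in which the queue was occupied, so a queue that is rarely occupied can still show a large average length.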

7.5 The Rest of System Activity 

avg-rate-of-x = x / (t2 - t1)

where x is swap in/out, blocks swapped in/out, terminal device activities,
read/write characters, block read/write, logical read/write, process 
switch, system calls, read/write, fork/exec, iget, namei, directory blocks 
read, disk/tape I/O activities, message, or semaphore activities. 
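
The general rate formula applies uniformly to every remaining counter, which the following Python sketch illustrates. The function name average_rates and the use of dictionaries keyed by counter name are assumptions made for the example.

```python
def average_rates(sample_t1, sample_t2, elapsed_seconds):
    """Return the per-second rate of each counter between two samples.

    The samples are dictionaries mapping a counter name (for example
    "syscall" or "pswitch") to its value at t1 and t2.
    """
    if elapsed_seconds <= 0:
        raise ValueError("t2 must be later than t1")
    return {
        name: (sample_t2[name] - sample_t1[name]) / elapsed_seconds
        for name in sample_t1
    }
```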






Chapter 11: UUCP ADMINISTRATION 



CONTENTS 



1. Introduction
2. Planning
   2.1 Extent of the Network
   2.2 Hardware and Line Speeds
   2.3 Maintenance and Administration
3. Uucp Software
4. Installation
   4.1 Object Modules
   4.2 Password File
   4.3 Lines File
       4.3.1 Naming Conventions
   4.4 System File-"L.sys"
   4.5 Dialing Prefixes— "L-dialcodes"
   4.6 Userfile
   4.7 Forwarding File
5. Administration
   5.1 Cleanup
       5.1.1 Cleanup of Undeliverable Jobs
       5.1.2 Cleanup of the Public Area
       5.1.3 Compaction of Log Files
   5.2 Polling Other Systems
   5.3 Problems
       5.3.1 Out of Space
       5.3.2 Bad ACU and Modems
       5.3.3 Administrative Problems
6. Debugging

LIST OF FIGURES

Figure 11.1. Uucp Network Daemon
Figure 11.2. Uucico Daemon Functional Blocks



Chapter 11 
UUCP ADMINISTRATION 



1. Introduction 

This chapter describes how a uucp network is set up, the format of con- 
trol files, and administrative procedures. Administrators should be 
familiar with the manual pages for each of the uucp related commands. 

2. Planning 

In setting up a network of UNIX systems, there are several considera- 
tions that should be taken into account before configuring each system 
on the network. The following parts attempt to outline the most 
important considerations. 

2.1 Extent of the Network 

Some basic decisions about access to processors in the network must be 
made before attempting to set up the configuration files. If an adminis- 
trator has control over only one processor and an existing network is 
being joined, then the administrator must decide what level of access 
should be granted to other systems. The other members of the net- 
work must make a similar decision for the new system. The UNIX sys- 
tem password mechanism is used to grant access to other systems. The 
file /usr/lib/uucp/USERFILE restricts access by other systems to parts of
the file system tree, and the file /usr/lib/uucp/L.sys on the local processor 
determines how many other systems on the network can be reached. 

When setting up more than one processor, the administrator has con- 
trol of a larger portion of the network and can make more decisions 
about the setup. For example, the network can be set up as a private 
network where only those machines under the direct control of the 
administrator can access each other. Granting no access to machines 
outside the network can be done if security is paramount; however, this 
is usually impractical. Very limited access can be granted to outside 
machines by each of the systems on the private network. Alternatively, 
access to/from the outside world can be confined to only one processor. 
This is frequently done to minimize the effort in keeping access infor- 
mation (passwords, phone numbers, login sequences, etc.) updated and 
to minimize the number of security holes for the private network. 




2.2 Hardware and Line Speeds 

There are only two supported means of interconnection by uucp(1):

1. Direct connection using a null modem. 

2. Connection over the Direct Distance Dialing (DDD) network. 

In choosing hardware, the equipment used by other processors on the 
network must be considered. For example, if some systems on the net- 
work have only 103-type (300-baud) data sets, then communication
with them is not possible unless the local system has a 300-baud data 
set connected to a calling unit. (Most data sets available on systems are 
1200-baud.) If hard-wired connections are to be used between systems, 
then the distance between systems must be considered since a null 
modem cannot be used when the systems are separated by more than 
several hundred feet. The limit for communication at 9600-baud is 
about 800 to 1000 feet. However, the RS232 specification and Western 
Electric Support Groups only allow for less than 50 feet. Limited dis- 
tance modems must be used beyond 50 feet as noise on the lines 
becomes a problem. 

2.3 Maintenance and Administration 

There is a minimum amount of maintenance that must be provided on 
each system to keep the access files updated, to ensure that the network 
is running properly, and to track down line problems. When more than 
one system is involved, the job becomes more difficult because there 
are more files to update and because users are much less patient when 
failures occur between machines that are under local control. 

3. Uucp Software 

Figure 11.1 (at the end of this chapter) is an illustration of the dae-
mons used by the uucp network to communicate with another system.
The uucp(1) or uux(1) command queues users' requests and spawns
the uucico daemon to call another system. Figure 11.2 (at the end of
this chapter) illustrates the structure of uucico and the tasks that it per- 
forms in communicating with another system. Uucico initiates the call 
to another system and performs the file transfer. On the receiving side, 
uucico is invoked to receive the transfer. Remote execution jobs are 
actually done by transferring a command file to the remote system and 
invoking a daemon (uuxqt) to execute that command file and return 
the results. 




4. Installation 

4.1 Object Modules 

The following object modules are installed as part of the uucp make 
procedure. 

1. uucp— The file transfer command (bin/uucp). 

2. uux— The remote execution command (bin/uux). 

3. uucico— The uucp network daemon (usr/lib/uucp/...). 

4. uustat— Network status command (bin/uustat). 

5. uuto— Sends source files to destination (bin/uuto). 

6. uulog— Queries a summary log of uucp and uux transactions 
(bin/uulog). 

7. uuname— lists the uucp names of known systems (bin/uuname). 

8. uuclean— Cleanup command (usr/lib/uucp/...). 

9. uusub— The command for monitoring and creating a subnetwork 
(bin/uusub). 

10. uuxqt— The remote execution daemon (usr/lib/uucp/...). 

11. uudemon.day— A shell procedure that is invoked each day to 
maintain the network. Shell scripts for execution each week 
(uudemon.wk) and each hour (uudemon.hr) are also distributed 
(usr/lib/uucp/...).

4.2 Password File 

To allow remote systems to call the local system, password entries must 
be made for any uucp logins. For example, 

nuucp:zaaAA:6:1:UUCP.Admin:/usr/spool/uucppublic:/usr/lib/uucp/uucico

Note that the uucico daemon is used for the shell, and the spool direc- 
tory is used as the working directory. 

There must also be an entry in the passwd file for a uucp administra-
tive login. This login is the owner of all the uucp object and spooled
data files and is usually "uucp". For example, the following is an entry in
/etc/passwd for this administrative login:

uucp:zAvLCKp:5:1:UUCP.Admin:/usr/lib/uucp:

Note that the standard shell is used instead of uucico.

4.3 Lines File 

The file /usr/lib/uucp/L-devices contains the list of all lines that are
directly connected to other systems or are available for calling other 
systems. The file contains the attributes of the lines and whether the 
line is a permanent connection or can call via a dialer. The format of 
the file is 

type line call-device speed protocol 

where each field is 

type          Two keywords are used to describe whether a line is
              directly connected to another system (DIR) or uses an
              automatic calling unit (ACU). An X.25 permanent
              virtual circuit would use the DIR keyword.

line          This is the device name for the line (e.g., ttyab for a
              direct line, cul0 for a line connected to an ACU).

call-device   If the ACU keyword is specified, this field contains the
              device name of the ACU. Otherwise, the field is
              ignored; however, a placeholder must be used in this
              field so that the protocol field can be interpreted.

speed         The line speed that the connection is to run at. (The
              speed field is currently ignored if an X.25 link is used.)

protocol      This is an optional field that needs only be filled in if
              the connection is for a protocol other than the default
              terminal protocol. The X.25 protocol is the only other
              protocol supported and the single character x is used to
              select this protocol.



The following entries illustrate various types of connections: 

DIR ttyab 0 9600
ACU cul0 cua0 1200
DIR x25.s0 0 300 x

The first entry is for a hard-wired line running at 9600-baud between
two systems. Note that the call-device field is zero. The second entry is
for a line with a 1200-baud ACU. The last entry is for an X.25
synchronous direct connection between systems. Note that the protocol
field is filled in and that the call-device and line speed fields are
meaningless.
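
The five-field format lends itself to a simple whitespace split, as the following Python sketch shows. It is an illustration only and does not reflect how uucico reads the file; the names DeviceEntry and parse_l_devices are hypothetical, and the sample lines assume that the unused call-device field is written as 0, as the text describes.

```python
from typing import NamedTuple, Optional


class DeviceEntry(NamedTuple):
    type: str                # DIR or ACU
    line: str                # device name of the line
    call_device: str         # ACU device name, or a placeholder
    speed: int               # line speed in baud
    protocol: Optional[str]  # "x" for X.25, None for the default


def parse_l_devices(text):
    """Parse L-devices style text into a list of DeviceEntry tuples."""
    entries = []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] not in ("DIR", "ACU"):
            raise ValueError(f"unknown type keyword: {fields[0]!r}")
        if len(fields) not in (4, 5):
            raise ValueError(f"expected 4 or 5 fields: {raw!r}")
        protocol = fields[4] if len(fields) == 5 else None
        entries.append(
            DeviceEntry(fields[0], fields[1], fields[2], int(fields[3]), protocol)
        )
    return entries
```

The split makes the purpose of the placeholder visible: without it, the speed would be read as the call-device and the protocol character as the speed.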

4.3.1 Naming Conventions 

It is often useful when naming lines that are directly connected 
between systems or which are dedicated to calling other systems to 
choose a naming scheme that conveys the use of the line. In the ear- 
lier examples, the name ttyab is used for the line that directly connects 
two systems named a and b. Similarly, lines associated with calling 
units are best given names that relate them to the calling unit (note the 
names cul0 and cua0 to specify the line and calling unit, respectively).

4.4 System File-"L.sys" 

Each entry in this file represents a system that can be called by the local 
UUCP programs. More than one line may be present for a particular sys- 
tem. In this case, the additional lines represent alternative communica- 
tion paths that will be tried in sequential order. The fields are 
described below. 

system name   Name of the remote system.

time          This is a string that indicates the days-of-week and
              times-of-day when the system should be called (e.g.,
              MoTuTh0800-1730).

              The day portion may be a list containing Su, Mo, Tu,
              We, Th, Fr, Sa; or it may be Wk for any week-day or
              Any for any day. The time should be a range of times
              (e.g., 0800-1230). If no time portion is specified, any
              time of day is assumed to be allowed for the call. Note
              that a time range that spans 0000 is permitted;
              0800-0600 means all times are allowed other than times
              between 6 and 8 am. An optional subfield is available
              to specify the minimum time (minutes) before a retry
              following a failed attempt. The subfield separator is a
              "," (e.g., Any,9 means call any time but wait at least 9
              minutes before retrying the call after a failure has
              occurred).

device        This is either ACU or the hard-wired device name to
              be used for the call. For the hard-wired case, the last
              part of the special file name is used (e.g., tty0).

class         This is usually the line speed for the call (e.g., 300).

phone         The phone number is made up of an optional alphabetic
              abbreviation (dialing prefix) and a numeric part.
              The abbreviation should be one that appears in the
              L-dialcodes file (e.g., mh1212, boston555-1212). For
              the hard-wired devices, this field contains the same
              string as used for the device field.

login         The login information is given as a series of fields and
              subfields in the format

                   [ expect send ] ...

              where expect is the string expected to be read and send
              is the string to be sent when the expect string is
              received.

              The expect field may be made up of subfields of the
              form

                   expect[-send-expect] ...

              where the send is sent if the prior expect is not
              successfully read and the expect following the send is
              the next expected string. (For example, login--login
              will expect login; if it gets it, the program will go on
              to the next field; if it does not get login, it will send
              null followed by a new line, then expect login again.)
              If no characters are initially expected from the remote
              machine, the string "" (a null string) should be used in
              the first expect field.

              There are two special names available to be sent during
              the login sequence. The string EOT will send an EOT
              character, and the string BREAK will try to send a
              BREAK character. (The BREAK character is simulated
              using line speed changes and null characters and
              may not work on all devices and/or systems.) A
              number from 1 to 9 may follow the BREAK (e.g.,
              BREAK1 will send 1 null character instead of the
              default of 3). Note that BREAK1 usually works best
              for 300-/1200-baud lines.
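
The structure of the login field can be made concrete with a Python sketch that splits the fields into expect and send steps. The function name parse_login_fields and the dictionary layout are hypothetical, and the sketch only parses the fields; it does not perform the dialogue, and it is not taken from the uucico source.

```python
def parse_login_fields(fields):
    """Turn the login fields of an L.sys entry into a list of steps.

    fields is the list of whitespace-separated tokens that follow the
    phone field, alternating expect and send.
    """
    steps = []
    for i in range(0, len(fields), 2):
        parts = fields[i].split("-")
        if len(parts) % 2 == 0:
            raise ValueError(f"unbalanced expect field: {fields[i]!r}")
        # Subfields alternate expect, send, expect, ...; "" means expect nothing.
        expects = ["" if part == '""' else part for part in parts[0::2]]
        sends = parts[1::2]
        steps.append({
            "expect": expects[0],
            # (string to send after a failed expect, next expected string)
            "on_failure": list(zip(sends, expects[1:])),
            "send": fields[i + 1] if i + 1 < len(fields) else None,
        })
    return steps
```

For the fields `login--login uucp ssword: word`, the first step expects login, falls back to sending an empty string and expecting login again, and then sends uucp; the second step expects ssword: and sends word.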




A typical entry in the L.sys file would be 

sys Any ACU 300 mh7654 login uucp ssword: word 

The expect algorithm matches all or part of the input string as illus- 
trated in the password field above. 
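
The rules for the time field described earlier in this section, including a range that spans midnight, can be sketched in Python as follows. The function names parse_time_field and call_allowed are hypothetical. The sketch assumes that both ends of the range are inclusive and that the day test is applied independently of the time test; the guide does not specify either point.

```python
import re

DAYS = ("Su", "Mo", "Tu", "We", "Th", "Fr", "Sa")


def parse_time_field(field):
    """Return (allowed days, (start, end) or None, retry minutes or None)."""
    spec, _, retry = field.partition(",")
    match = re.fullmatch(r"([A-Za-z]+)(?:(\d{4})-(\d{4}))?", spec.strip())
    if match is None:
        raise ValueError(f"bad time field: {field!r}")
    day_part, start, end = match.groups()
    days = set()
    pos = 0
    while pos < len(day_part):
        if day_part.startswith("Any", pos):
            days.update(DAYS)
            pos += 3
            continue
        token = day_part[pos:pos + 2]
        if token == "Wk":
            days.update(DAYS[1:6])  # Monday through Friday
        elif token in DAYS:
            days.add(token)
        else:
            raise ValueError(f"bad day name in {field!r}")
        pos += 2
    window = (int(start), int(end)) if start else None
    retry_minutes = int(retry) if retry.strip() else None
    return days, window, retry_minutes


def call_allowed(field, day, hhmm):
    """True if a call may be placed on the given day at time hhmm (e.g. 930)."""
    days, window, _ = parse_time_field(field)
    if day not in days:
        return False
    if window is None:
        return True
    start, end = window
    if start <= end:
        return start <= hhmm <= end
    # The range spans 0000: allowed late in the day or early in the morning.
    return hhmm >= start or hhmm <= end
```

With the field Any0800-0600, a call at 0700 is refused while calls at 2300 and 0300 are allowed, matching the example given for a range that spans 0000.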

4.5 Dialing Prefixes— "L-dialcodes" 

This file contains the dial-code abbreviations used in the L.sys file (e.g.,
py, mh, boston). The entry format is 

abb dial-seq 

where abb is the abbreviation and dial-seq is the dial sequence to call 
that location. 

The line 

py 165- 

would be set up so that entry py7777 would send 165-7777 to the dial
unit. 
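
The prefix substitution can be illustrated with a short Python sketch. The function names parse_dialcodes and expand_phone are hypothetical, and the sketch assumes that the abbreviation is simply the leading run of letters in the phone field.

```python
import re


def parse_dialcodes(text):
    """Read 'abb dial-seq' lines into a dictionary."""
    codes = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            codes[fields[0]] = fields[1]
    return codes


def expand_phone(phone, dialcodes):
    """Replace a leading alphabetic abbreviation with its dial sequence."""
    prefix, rest = re.fullmatch(r"([A-Za-z]*)(.*)", phone).groups()
    if prefix == "":
        return rest  # no abbreviation; use the number as given
    if prefix not in dialcodes:
        raise KeyError(f"no dial code for abbreviation {prefix!r}")
    return dialcodes[prefix] + rest
```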

4.6 Userfile

The USERFILE contains user accessibility information. It specifies four 
types of constraints: 

1. Files that can be accessed by a normal user of the local machine.

2. Files that can be accessed from a remote computer. 

3. Login name used by a particular remote computer. 

4. Whether a remote computer should be called back in order to 
confirm its identity. 

Each line in the file has the format 

login,sys [ c ] pathname [ pathname ] ...
where

login is the login name for a user or the remote computer. 

sys is the system name for a remote computer. 

c is the optional call-back required flag. 




pathname is a pathname prefix that is acceptable for sys. 

The constraints are implemented as follows: 

1. When the program is obeying a command stored on the local 
machine, the pathnames allowed are those given on the first line 
in the USERFILE that has the login name of the user who 
entered the command. If no such line is found, the first line with 
a null login name is used. 

2. When the program is responding to a command from a remote 
machine, the pathnames allowed are those given on the first line 
in the file that has the system name that matches the remote 
machine. If no such line is found, the first one with a null system 
name is used. 

3. When a remote computer logs in, the login name that it uses must 
appear in the USERFILE. There may be several lines with the 
same login name but one of them must either have the name of 
the remote system or must contain a null system name. 

4. If the line matched in (3.) contains a "c", the remote machine is 
called back before any transactions take place. 

The line 

u,m /usr/xyz 

allows machine m to login with name u and request the transfer of files 
whose names start with /usr/xyz. The line 

you, /usr/you 

allows the ordinary user you to issue commands for files whose names
start with /usr/you. (This type of restriction is seldom used.) The lines

u,m /usr/xyz /usr/spool 
u, /usr/spool 

allow any remote machine to log in with name u. If its system name is
not m, it can only ask to transfer files whose names start with /usr/spool.
If it is system m, it can send files from paths /usr/xyz as well as
/usr/spool. The lines

root, / 
, /usr 




allow any user to transfer files beginning with /usr but the user with 
login root can transfer any file. (Note that any file that is to be 
transferred must be readable by anybody.) 
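
The lookup rules for the USERFILE can be sketched in Python. This is an illustration of the rules as worded in this section, not the uucp implementation. The names parse_userfile, first_match, and path_allowed are hypothetical. The search tries an exact name first and then falls back to the first line whose name is null.

```python
def parse_userfile(text):
    """Parse USERFILE lines of the form: login,sys [ c ] pathname ..."""
    entries = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        login, _, system = tokens[0].partition(",")
        rest = tokens[1:]
        callback = bool(rest) and rest[0] == "c"
        paths = rest[1:] if callback else rest
        entries.append(
            {"login": login, "sys": system, "callback": callback, "paths": paths}
        )
    return entries


def first_match(entries, key, name):
    """Return the first line whose field equals name, else the first null one."""
    for wanted in (name, ""):
        for entry in entries:
            if entry[key] == wanted:
                return entry
    return None


def path_allowed(entry, pathname):
    """True if pathname begins with one of the entry's pathname prefixes."""
    return entry is not None and any(pathname.startswith(p) for p in entry["paths"])


remote = parse_userfile("u,m /usr/xyz /usr/spool\nu, /usr/spool\n")
local = parse_userfile("root, /\n, /usr\n")
print(first_match(remote, "sys", "m")["paths"])
print(first_match(local, "login", "root")["paths"])
```

Use key "login" for commands entered on the local machine and key "sys" for requests from a remote machine.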

4.7 Forwarding File 

There are two files that allow restrictions to be placed on the forwarding 
mechanism. The format of the entries in each file is the same, 

system 
or 

system,user,user2,... 

The file ORIGFILE (/usr/lib/uucp/ORIGFILE) restricts the access of systems
that are attempting to forward through the local system. The file
contains the list of systems (and users) for whom the local system is 
willing to forward. Each entry refers to the system that was the source 
of the original job and not the name of the last system to forward the 
file. The second file, FWDFILE (/usr/lib/uucp/FWDFILE), is a list of 
valid systems that a job can be forwarded to. (It is not necessarily the 
name of the destination of a job, but merely the next valid node.) This 
file will be a subset of the L.sys file and can be used to prevent forward- 
ing to systems that are very expensive to reach but to which access by 
local users is allowed (e.g., links to overseas universities). If neither of
these files exists, uucp will forward for any system.
As an example, if the entry for system australia were in the ORIGFILE 
but not in the FWDFILE on system mhtsa, it would mean that system 
australia would be capable of forwarding jobs into the network via sys- 
tem mhtsa. However, no systems in the network could forward a job to 
australia via system mhtsa. 
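
The effect of the two files can be sketched in Python. This is a rough illustration under stated assumptions, not the uucp logic itself. The function names are hypothetical, as is the system name mhtsb. A missing file is treated as imposing no restriction, the user list is checked only against ORIGFILE, and an entry with no users is taken to admit any user. The section does not spell out any of these three points.

```python
def parse_forward_file(text):
    """Map each system name to its set of users (empty set: no users listed)."""
    table = {}
    for line in text.splitlines():
        names = [n.strip() for n in line.split(",") if n.strip()]
        if names:
            table[names[0]] = set(names[1:])
    return table


def listed(table, system, user=None):
    """True if system (and user, when given) is admitted by the table."""
    if table is None:
        # Assumption: a file that does not exist imposes no restriction.
        return True
    if system not in table:
        return False
    users = table[system]
    return not users or user is None or user in users


def may_forward(origfile, fwdfile, origin_system, origin_user, next_system):
    """Check the original source against ORIGFILE and the next node against FWDFILE."""
    return listed(origfile, origin_system, origin_user) and listed(fwdfile, next_system)


origfile = parse_forward_file("australia\nmhtsb\n")
fwdfile = parse_forward_file("mhtsb\n")
print(may_forward(origfile, fwdfile, "australia", "tom", "mhtsb"))
print(may_forward(origfile, fwdfile, "mhtsb", "tom", "australia"))
```

With these tables, a job that originated on australia may be forwarded on to mhtsb, but a job may not be forwarded to australia.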

5. Administration 

The role of the uucp administrator depends heavily on the amount of 
traffic that enters or leaves a system and the quality of the connections 
that can be made to and from that system. For the average system, 
only a modest amount of traffic (100 to 200 files per day) passes through
the system and little if any intervention with the uucp automatic 
cleanup functions is necessary. Systems that pass large numbers of files 
(200 to 10,000) may require more attention when problems occur. The 
following parts describe the routine administrative tasks that must be 
performed by the administrator or are automatically performed by the 
uucp package. The part on problems describes the most frequent
problems and how to deal with them effectively.






5.1 Cleanup 

The biggest problem in a dialup network like uucp is dealing with the 
backlog of jobs that cannot be transmitted to other systems. The fol- 
lowing cleanup activities should be routinely performed by shell scripts 
started from cron(1).

5.1.1 Cleanup of Undeliverable Jobs 

The uudemon.day procedure usually contains an invocation of the 
uuclean command to purge any jobs that are older than some fixed 
time (usually 72 hours). A similar procedure is usually used to purge 
any lock or status files. An example invocation of uuclean(1M) to
remove both job files and status files that are more than 48 hours old is:

/usr/lib/uucp/uuclean -pST -pC -n48

5.1.2 Cleanup of the Public Area 

In order to keep the local file system from overflowing when files are 
sent to the public area, the uudemon.day procedure is usually set up 
with a find command to remove any files that are older than 7 days. 
This interval may need to be shortened if there is not sufficient space to 
devote to the public area. 
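
The age-based cleanup can be sketched in Python. This only illustrates the idea: the uudemon.day procedure uses a find command, and the function name remove_old_files and its parameters are hypothetical. The directory is passed in as an argument and is not assumed here.

```python
import os
import time


def remove_old_files(public_dir, max_age_days=7, now=None):
    """Delete regular files under public_dir last modified over max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    removed = []
    for dirpath, _dirnames, filenames in os.walk(public_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Compare the file's modification time against the cutoff.
            if os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed.append(path)
    return removed
```

Lowering max_age_days corresponds to shortening the interval when space for the public area is limited.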

5.1.3 Compaction of Log Files 

The files SYSLOG and LOGFILE that contain logging information are
compacted daily (using the pack command from the shell script 
uudemon.day) and should be kept for 1 week before being overwritten. 

5.2 Polling Other Systems 

Systems that are passive members of the network must be polled by 
other systems in order for their files to be sent. This can be arranged 
by using the uusub(1) command as follows:

uusub -cmhtsd

which will call mhtsd when it is invoked. 

5.3 Problems 

The following sections list the most frequent problems that appear on 
systems that make heavy use of uucp(1).






5.3.1 Out of Space 

The file system used to spool incoming or outgoing jobs can run out of 
space and prevent jobs from being spawned or received from remote 
systems. The inability to receive jobs is the worse of the two condi- 
tions. When file space does become available, the system will be 
flooded with the backlog of traffic. 

5.3.2 Bad ACU and Modems 

The ACU and incoming modems occasionally cause problems that 
make it difficult to contact other systems or to receive files. These 
problems are usually easy to identify, since LOGFILE entries usually
point to the bad line. If a bad line is suspected, it is useful to
use the cu(1) command to try calling another system using the
suspected line.

5.3.3 Administrative Problems 

Some UUCP networks have so many members that it is difficult to keep 
track of changing passwords, changing phone numbers, or changing 
logins on remote systems. This can be a very costly problem since 
ACUs will be tied up calling a system that cannot be reached.

6. Debugging 

In order to verify that a system on the network can be contacted, the 
uucico daemon can be invoked from a user's terminal directly. For 
example, to verify that mhtsd can be contacted, a job would be queued 
for that system as follows: 

uucp -r file mhtsd!~/tom

The -r option forces the job to be queued but does not invoke the
daemon to process the job. The uucico command can then be invoked 
directly: 

/usr/lib/uucp/uucico -r1 -x4 -smhtsd

The -r1 option is necessary to indicate that the daemon is to start up
in master mode (i.e., it is the calling system). The -x4 option specifies the
level of debugging that is to be printed. Higher levels of debugging (greater
than 4) can be printed, but interpreting them requires familiarity with the internals of
uucico. If several jobs are queued for the remote system, it is not pos- 
sible to force uucico to send one particular job first. The contents of 
LOGFILE should also be monitored for any error indications that it 




posts. Frequently, problems can be isolated by examining the entries in 
LOGFILE associated with a particular system. The file ERRLOG also 
contains error indications. 






Figure 11.1 Uucp Network Daemon

[Diagram: SYSTEM A (spool area, worklist) connected through the
interconnection media to SYSTEM B (spool area).]



Figure 11.2 Uucico Daemon Functional Blocks

[Diagram: the uucico daemon, with blocks labeled worklist; sequence and
interlock; dialing (DDD, Datakit, X.25); initial connection; file transfer
protocol; byte stream; packet protocol; UNIX system OS; hardware; DDD network.]



